Applicant herewith submits to the United States Designated/Elected Office (DO/EO/US) the following items and other information:

1. G2 This is a FIRST submission of items concerning a filing under 35 U.S.C. 371.

2. CD This is a SECOND or SUBSEQUENT submission of items concerning a filing under 35 U.S.C. 371. 

3. G3 This is an express request to begin national examination procedures (35 U.S.C. 371(f)) at any time rather than delay examination until the expiration of the applicable time limit set in 35 U.S.C. 371(b) and the PCT Articles 22 and 39(1).

4. t3 A proper Demand for International Preliminary Examination was made by the 19th month from the earliest claimed priority date.

5. C3 A copy of the International Application as filed (35 U.S.C. 371(c)(2))

a. 0 is transmitted herewith (required only if not transmitted by the International Bureau).
b. [3 has been transmitted by the International Bureau.
c. L_J is not required, as the application was filed in the United States Receiving Office (RO/US).

6. A translation of the International Application into English (35 U.S.C. 371(c)(2)).

7. D Amendments to the claims of the International Application under PCT Article 19 (35 U.S.C. 371(c)(3))
a. LZ3 are transmitted herewith (required only if not transmitted by the International Bureau).
b. have been transmitted by the International Bureau.
c. LZ3 have not been made; however, the time limit for making such amendments has NOT expired.
d. have not been made and will not be made.

8. L_l A translation of the amendments to the claims under PCT Article 19 (35 U.S.C. 371(c)(3)).

9. Q An executed oath or declaration of the inventor(s) (35 U.S.C. 371(c)(4)).

10. □ A translation of the annexes to the International Preliminary Examination Report under PCT Article 36 (35 U.S.C. 371(c)(5)). 
Items 11. to 16. below concern other document(s) or information included: 

11. D An Information Disclosure Statement under 37 CFR 1.97 and 1.98.

12. □ An assignment document for recording. A separate cover sheet in compliance with 37 CFR 3.28 and 3.31 is included. 

13. O A FIRST preliminary amendment. 

D A SECOND or SUBSEQUENT preliminary amendment. 

14. □ A substitute specification. 

15. □ A change of power of attorney and/or address letter. 

16. □ Other items or information: 
International Preliminary Examination Report; Written Opinion; International Search Report; Petition To Revive 
Rec'd PCT/PTO 02 JAN 2001



U.S. APPLICATION NO. (If known, see 37 C.F.R. 1.50): 09/720934
INTERNATIONAL APPLICATION NO.: PCT/US99/08371
ATTORNEY'S DOCKET NUMBER: 2320-1-001 PCT/US



17. □ The following fees are submitted:

Basic National Fee (37 CFR 1.492(a)(1)-(5)):
  Search Report has been prepared by the EPO or JPO: $860.00
  International preliminary examination fee paid to USPTO (37 CFR 1.482): $690.00
  No international preliminary examination fee paid to USPTO (37 CFR 1.482) but international search fee paid to USPTO (37 CFR 1.445(a)(2)): $710.00
  Neither international preliminary examination fee (37 CFR 1.482) nor international search fee (37 CFR 1.445(a)(2)) paid to USPTO: $1,000.00
  International preliminary examination fee paid to USPTO (37 CFR 1.482) and all claims satisfied provisions of PCT Article 33(2)-(4): $100.00

CALCULATIONS

ENTER APPROPRIATE BASIC FEE AMOUNT = $860.00



Surcharge of $130.00 for furnishing the oath or declaration later than □ 20 □ 30 months from the earliest claimed priority date (37 CFR 1.492(e)).



Claims                 Number Filed    Number Extra    Rate
Total Claims           57 - 20 =       37              x $18.00     666.00
Independent Claims     11 - 3 =        8               x $80.00     640.00
Multiple dependent claim(s) (if applicable)            + $270.00    .00



TOTAL OF ABOVE CALCULATIONS = $2,166.00

Reduction of 1/2 for filing by small entity, if applicable. Verified Small Entity statement must also be filed (Note 37 CFR 1.9, 1.27, 1.28): $1,083.00

SUBTOTAL = $1,083.00

Processing fee of $130.00 for furnishing the English translation later than □ 20 □ 30 months from the earliest claimed priority date (37 CFR 1.492(f)): .00

TOTAL NATIONAL FEE = $1,083.00

Fee for recording the enclosed assignment (37 CFR 1.21(h)). The assignment must be accompanied by an appropriate cover sheet (37 CFR 3.28, 3.31). $40.00 per property: + $40.00

TOTAL FEES ENCLOSED = $1,123.00
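
The fee arithmetic on this sheet can be re-derived with a short script. What follows is a minimal sketch in Python and is not part of the filing. The function name `national_stage_fees` and its parameter names are hypothetical. The rates are the ones printed on this sheet: $18.00 per claim in excess of 20, $80.00 per independent claim in excess of 3, a one-half reduction for a small entity, and a $40.00 assignment recording fee.

```python
from decimal import Decimal

# Rates as printed on this transmittal sheet (assumed here, not current USPTO rates).
RATE_PER_EXTRA_CLAIM = Decimal("18.00")        # each claim in excess of 20
RATE_PER_EXTRA_INDEPENDENT = Decimal("80.00")  # each independent claim in excess of 3


def national_stage_fees(basic_fee, total_claims, independent_claims,
                        small_entity, assignment_fee):
    """Hypothetical helper: recompute the totals shown on the sheet."""
    extra_claims = max(total_claims - 20, 0)
    extra_independent = max(independent_claims - 3, 0)
    claims_fee = extra_claims * RATE_PER_EXTRA_CLAIM
    independent_fee = extra_independent * RATE_PER_EXTRA_INDEPENDENT
    total = basic_fee + claims_fee + independent_fee
    # A small entity pays one half of the calculated total.
    national_fee = total / 2 if small_entity else total
    return {
        "claims_fee": claims_fee,
        "independent_fee": independent_fee,
        "total_of_calculations": total,
        "total_national_fee": national_fee,
        "total_fees_enclosed": national_fee + assignment_fee,
    }


fees = national_stage_fees(
    basic_fee=Decimal("860.00"),
    total_claims=57,
    independent_claims=11,
    small_entity=True,
    assignment_fee=Decimal("40.00"),
)
for label, amount in fees.items():
    print(f"{label}: ${amount:,.2f}")
```

The sketch leaves out the late-oath surcharge and the translation processing fee, since no amount is entered for either on this sheet. `Decimal` is used instead of floats so that dollar amounts compare exactly.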



Amount to be refunded:
Amount to be charged:

a. Q A check in the amount of $1,123.00 to cover the above fees is enclosed.

b. □ Please charge my Deposit Account No. 11-1153 in the amount of $ ____ to cover the above fees. A duplicate copy of this sheet is enclosed.

c. □ The Commissioner is hereby authorized to charge any additional fees which may be required, or credit any overpayment to Deposit Account No. 11-1153. A duplicate copy of this sheet is enclosed.

NOTE: Where an appropriate time limit under 37 CFR 1.494 or 1.495 has not been met, a petition to revive (37 CFR 1.137(a) or (b)) must be filed and granted to restore the application to pending status.

SEND ALL CORRESPONDENCE TO: 

DAVID A. JACKSON 
KLAUBER & JACKSON 
411 HACKENSACK AVENUE 

4TH FLOOR 
HACKENSACK, NEW JERSEY 07601 

NAME: DAVID A. JACKSON
REGISTRATION NUMBER: 26,742



EXPRESS MAIL CERTIFICATE NO.: EL684490948US DATE OF DEPOSIT: JANUARY 2, 2001 






PATENT 
2320-1-001 PCT/US 



IN THE UNITED STATES PATENT AND TRADEMARK OFFICE
APPLICANTS : Julie R. Korenberg and Xiao-Ning Chen 

SERIAL NO. : 09/720,934 

FILED : January 2, 2001 

FOR : ISOLATED SH3 GENES ASSOCIATED WITH 

MYELOPROLIFERATIVE DISORDERS AND LEUKEMIA, 
AND USES THEREOF 

STATEMENT IN SUPPORT OF THE FILING/SUBMISSION OF A
NUCLEOTIDE/AMINO ACID SEQUENCE LISTING IN
ACCORDANCE WITH 37 CFR §§1.821 - 1.825

ASSISTANT COMMISSIONER FOR PATENTS
BOX PCT
WASHINGTON, DC 20231

Dear Sir:

DAVID A. JACKSON, attorney of record, hereby states as follows:

1. I hereby state that the content of the paper and computer readable copies of the Sequence Listing submitted in accordance with 37 CFR §1.821(c) and (e), respectively, are the same.

2. I hereby state that the submission, filed in accordance with 37 CFR §1.821(g), herein does not include new matter.






3. I hereby declare that all statements made herein of the undersigned's own 
knowledge are true and that all statements made on information and belief are believed to be 
true; and further, that these statements were made with the knowledge that willful false 
statements and the like so made are punishable by fine or imprisonment, or both, under Title 18 
of the U.S. Code, Section 1001 and that such willful false statements may jeopardize the validity 
of this Application or any patent issuing thereon. 



DATED: October 3, 2001 
RAW SEQUENCE LISTING DATE: 11/14/2001

PATENT APPLICATION: US/09/720,934 TIME: 14:05:23 

Input Set : A:\Sequence Listing.txt 
Output Set: N:\CRF3\11142001\I720934.raw 

3 <110> APPLICANT: Korenberg, Julie R
4 Chen, Xiao-Ning

6 <120> TITLE OF INVENTION: ISOLATED SH3 GENES ASSOCIATED WITH MYELOPROLIFERATIVE
7 DISORDERS AND LEUKEMIA, AND USES THEREOF

9 <130> FILE REFERENCE: 2320-1-001PCT

C--> 11 <140> CURRENT APPLICATION NUMBER: US/09/720,934
C--> 12 <141> CURRENT FILING DATE: 2001-10-03

14 <150> PRIOR APPLICATION NUMBER: 60/082,007
15 <151> PRIOR FILING DATE: 1998-04-16

17 <160> NUMBER OF SEQ ID NOS: 109

19 <170> SOFTWARE: PatentIn Ver. 2.0

21 <210> SEQ ID NO: 1
22 <211> LENGTH: 5199
23 <212> TYPE: DNA
24 <213> ORGANISM: Homo sapiens

26 <400> SEQUENCE: 1

27 caaaagaatt ccgggtacgg cggctcgcga ggaagaatcc cgagcgggct ccgggacgga 60
28 cagagaggcg ggcggggatg gtgtgcgggg ctgcggctcc tgcgtccctc ccagcggcgc 120
29 gtgagcggca ctgatttgtc cctggggcgg cagcgcggac ccgcccggag atgaggcgtc 180
30 gattagcaag gtaaaagtaa cagaaccatg gctcagtttc caacaccttt tggtggcagc 240
31 ctggatatct gggccataac tgtagaggaa agagcgaagc atgatcagca gttccatagt 300
32 ttaaagccaa tatctggatt cattactggt gatcaagcta gaaacttttt ttttcaatct 360
33 gggttacctc aacctgtttt agcacagata tgggcactag ctgacatgaa taatgatgga 420
34 agaatggatc aagtggagtt ttccatagct atgaaactta tcaaactgaa gctacaagga 480
35 tatcagctac cctctgcact tccccctgtc atgaaacagc aaccagttgc tatttctagc 540
36 gcaccagcat ttggtatggg aggtatcgcc agcatgccac cgcttacagc tgttgctcca 600
37 gtgccaatgg gatccattcc agttgttgga atgtctccaa ccctagtatc ttctgttccc 660
38 acagcagctg tgccccccct ggctaacggg gctccccctg ttatacaacc tctgcctgca 720
39 tttgctcatc ctgcagccac attgccaaag agttcttcct ttagtagatc tggtccaggg 780

40 tcacaactaa acactaaatt acaaaaggca cagtcatttg atgtggccag tgtcccacca 840 

41 gtggcagagt gggctgttcc tcagtcatca agactgaaat acaggcaatt attcaatagt 900 

42 catgacaaaa ctatgagtgg acacttaaca ggtccccaag caagaactat tcttatgcag 960 

43 tcaagtttac cacaggctca gctggcttca atatggaatc tttctgacat tgatcaagat 1020 

44 ggaaaactta cagcagagga atttatcctg gcaatgcacc tcattgatgt agctatgtct 1080 

45 ggccaaccac tgccacctgt cctgcctcca gaatacattc caccttcttt tagaagagtt 1140 

46 cgatctggca gtggtatatc tgtcataagc tcaacatctg tagatcagag gctaccagag 1200 

47 gaaccagttt tagaagatga acaacaacaa ttagaaaaga aattacctgt aacgtttgaa 1260 

48 gataagaagc gggagaactt tgaacgtggc aacctggaac tggagaaacg aaggcaagct 1320 
49 ctcctggaac agcagcgcaa ggagcaggag cgcctggccc agctggagcg ggcggagcag 1380

50 gagaggaagg agcgtgagcg ccaggagcaa gagcgcaaaa gacaactgga actggagaag 1440 

51 caactggaaa agcagcggga gctagaacgg cagagagagg aggagaggag gaaagaaatt 1500 

52 gagaggcgag aggctgcaaa acgggaactt gaaaggcaac gacaacttga gtgggaacgg 1560 

53 aatcgaaggc aagaactact aaatcaaaga aacaaagaac aagaggacat agttgtactg 1620 

54 aaagcaaaga aaaagacttt ggaatttgaa ttagaagctc taaatgataa aaagcatcaa 1680 

55 ctagaaggga aacttcaaga tatcagatgt cgattgacca cccaaaggca agaaattgag 1740 

56 agcacaaaca aatctagaga gttgagaatt gccgaaatca cccatctaca gcaacaatta 1800 

57 caggaatctc agcaaatgct tggaagactt attccagaaa aacagatact caatgaccaa 1860 

58 ttaaaacaag ttcagcagaa cagtttgcac agagattcac ttgttacact taaaagagcc 1920 

59 ttagaagcaa aagaactagc tcggcagcac ctacgagacc aactggatga agtggagaaa 1980 

60 gaaactagat caaaactaca ggagattgat attttcaata atcagctgaa ggaactaaga 2040 

61 gaaatacaca ataagcaaca actccagaag caaaagtcca tggaggctga acgactgaaa 2100 

62 cagaaagaac aagaacgaaa gatcatagaa ttagaaaaac aaaaagaaga agcccaaaga 2160 

63 cgagctcagg aaagggacaa gcagtggctg gagcatgtgc agcaggagga cgagcatcag 2220 

64 agaccaagaa aactccacga agaggaaaaa ctgaaaaggg aggagagtgt caaaaagaag 2280 

65 gatggcgagg aaaaaggcaa acaggaagca caagacaagc tgggtcggct tttccatcaa 2340 

66 caccaagaac cagctaagcc agctgtccag gcaccctggt ccactgcaga aaaaggtcca 2400 

67 cttaccattt ctgcacagga aaatgtaaaa gtggtgtatt accgggcact gtaccccttt 2460 

68 gaatccagaa gccatgatga aatcactatc cagccaggag acatagtcat ggtggatgaa 2520

69 agccaaactg gagaacccgg ctggcttgga ggagaattaa aaggaaagac agggtggttc 2580 

70 cctgcaaact atgcagagaa aatcccagaa aatgaggttc ccgctccagt gaaaccagtg 2640 

71 actgattcaa catctgcccc tgcccccaaa ctggccttgc gtgagacccc cgcccctttg 2700 

72 gcagtaacct cttcagagcc ctccacgacc cctaataact gggccgactt cagctccacg 2760 

73 tggcccacca gcacgaatga gaaaccagaa acggataact gggatgcatg ggcagcccag 2820 
74 ccctctctca ccgttccaag tgccggccag ttaaggcaga ggtccgcctt tactccagcc 2880
75 acggccactg gctcctcccc gtctcctgtg ctaggccagg gtgaaaaggt ggaggggcta 2940
76 caagctcaag ccctatatcc ttggagagcc aaaaaagaca accacttaaa ttttaacaaa 3000
77 aatgatgtca tcaccgtcct ggaacagcaa gacatgtggt ggtttggaga agttcaaggt 3060
78 cagaagggtt ggttccccaa gtcttacgtg aaactcattt cagggcccat aaggaagtct 3120
79 acaagcatgg attctggttc ttcagagagt cctgctagtc taaagcgagt agcctctcca 3180
80 gcagccaagc cggtcgtttc gggagaagaa attgcccagg ttattgcctc atacaccgcc 3240
81 accggccccg agcagctcac tctcgcccct ggtcagctga ttttgatccg aaaaaagaac 3300
82 ccaggtggat ggtgggaagg agagctgcaa gcacgtggga aaaagcgcca gataggctgg 3360
83 ttcccagcta attatgtaaa gcttctaagc cctgggacga gcaaaatcac tccaacagag 3420
84 ccacctaagt caacagcatt agcggcagtg tgccaggtga ttgggatgta cgactacacc 3480
85 gcgcagaatg acgatgagct ggccttcaac aagggccaga tcatcaacgt cctcaacaag 3540
86 gaggaccctg actggtggaa aggagaagtc aatggacaag tggggctctt cccatccaat 3600
87 tatgtgaagc tgaccacaga catggaccca agccagcaat gaatcatatg ttgtccatcc 3660
88 ccccctcagg cttgaaagtc ctcaaagaga cccactatcc catatcactg cccagaggga 3720
89 tgatgggaga tgcagccttg atcatgtgac ttccagcatg atcacctact gccttctgag 3780
90 tagaagaact cactgcagag cagtttacct cattttacct tagttgcatg tgatcgcaat 3840

91 gtttgagtta ttacttgcag agataggagc aaaaattaca aaaacacaca gggtagtggg 3900 

92 tccttttgtg gctttcctag ttactcaaat tgactttccc ccacctttgc acaggtgctt 3960

93 tcaatagttt taaaattatt tttaaatata tattttagct ttttaataaa caaaataaat 4020 

94 aaatgacttc tttgctattt tggttttgca aaaagaccca ctatcaagga atgctgcatg 4080 

95 tgctattaaa aattgttcca aatgtccata aatctgagac ttgatgtatt ttttcatttt 4140 

96 gtccagtgtt accaactaaa ttgctgcagt ttggggcttt tcccccttac catagaagtg 4200 

97 cagaggagtt cagtatctct gttttaaaga cgtatagaat gagcccaatt aaagcgaagg 4260 

98 tgattgtgct tgtttgtgtg tatcagctgt accttgttga gcatgtaata catcctgtac 4320 

99 ataagaaatt agttctttcc atggcaaagc tattaccttg tacgatgctc taatcatatt 4380 

100 gcatttaatt ttattttgca acagtgacct tgtagccaca tgagaaagca ctctgtgttt 4440

101 ttgttcggtc tcagatttat ctggttgagt tggtgttttg tttggggttt ttaattttgc 4500 

102 gtgtttgcat agcataaaat cagtagacaa caccactgag gtcgttacga tcaacgatat 4560 

103 ccacagtctc tttttagtct ctgttacatg aagttttatt ccagttactt ttcatggaat 4620 

104 gacctatttt gaacaagtaa ttttcttgac aagaaagaat gtatagaagt ctccctgcaa 4680 

105 ttaatttcca atgtttacat tttttaacta ggactgtgga atttctacag attaatatga 4740

106 aatggagctc atggtccgtt tgtgtgttag atatgctgta gctgaagccc tgtttgtctt 4800 




107 ttaaacacta gttggaagct ctcaataaaa atgcctgctg ctcacagcac agaaaatggg 4860 

108 gcagggggag cctcaagcac aatctagctg tcctcctaaa gactctgtaa tgctcaatcc 4920 

109 ccttgcgttc tcccggcgct gtcgggaggc tgtgctggtg gtcgtgtaga ggtccttttc 4980 

110 ctttcaaatg gtgcagagag agaggacctt tcctccttgt tcagttgcaa ttcagtattt 5040 

111 tcacggatat gaatgtaaaa tatataaata tataaacctg aggatttaac aaatgtaaaa 5100 

112 caaccttttg aattagttcc gagtatagat aattaaattt ttaaaacaaa agtaaaaaaa 5160 

113 aaaaaaaaaa aaaaaaaaaa aaaagtcgac gcggccgcg 5199 

115 <210> SEQ ID NO: 2 

116 <211> LENGTH: 1143 

117 <212> TYPE: PRT 

118 <213> ORGANISM: Homo sapiens 

120 <400> SEQUENCE: 2 

121 Met Ala Gln Phe Pro Thr Pro Phe Gly Gly Ser Leu Asp Ile Trp Ala
122 1 5 10 15

124 Ile Thr Val Glu Glu Arg Ala Lys His Asp Gln Gln Phe His Ser Leu
125 20 25 30

127 Lys Pro Ile Ser Gly Phe Ile Thr Gly Asp Gln Ala Arg Asn Phe Phe
128 35 40 45

130 Phe Gln Ser Gly Leu Pro Gln Pro Val Leu Ala Gln Ile Trp Ala Leu
131 50 55 60

133 Ala Asp Met Asn Asn Asp Gly Arg Met Asp Gln Val Glu Phe Ser Ile
134 65 70 75 80

136 Ala Met Lys Leu Ile Lys Leu Lys Leu Gln Gly Tyr Gln Leu Pro Ser
137 85 90 95

139 Ala Leu Pro Pro Val Met Lys Gln Gln Pro Val Ala Ile Ser Ser Ala
140 100 105 110

142 Pro Ala Phe Gly Met Gly Gly Ile Ala Ser Met Pro Pro Leu Thr Ala
143 115 120 125

145 Val Ala Pro Val Pro Met Gly Ser Ile Pro Val Val Gly Met Ser Pro
146 130 135 140

148 Thr Leu Val Ser Ser Val Pro Thr Ala Ala Val Pro Pro Leu Ala Asn
149 145 150 155 160

151 Gly Ala Pro Pro Val Ile Gln Pro Leu Pro Ala Phe Ala His Pro Ala
152 165 170 175

154 Ala Thr Leu Pro Lys Ser Ser Ser Phe Ser Arg Ser Gly Pro Gly Ser
155 180 185 190

157 Gln Leu Asn Thr Lys Leu Gln Lys Ala Gln Ser Phe Asp Val Ala Ser
158 195 200 205

160 Val Pro Pro Val Ala Glu Trp Ala Val Pro Gln Ser Ser Arg Leu Lys
161 210 215 220

163 Tyr Arg Gln Leu Phe Asn Ser His Asp Lys Thr Met Ser Gly His Leu
164 225 230 235 240

166 Thr Gly Pro Gln Ala Arg Thr Ile Leu Met Gln Ser Ser Leu Pro Gln
167 245 250 255

169 Ala Gln Leu Ala Ser Ile Trp Asn Leu Ser Asp Ile Asp Gln Asp Gly
170 260 265 270

172 Lys Leu Thr Ala Glu Glu Phe Ile Leu Ala Met His Leu Ile Asp Val
173 275 280 285

175 Ala Met Ser Gly Gln Pro Leu Pro Pro Val Leu Pro Pro Glu Tyr Ile

176 290 295 300 

178 Pro Pro Ser Phe Arg Arg Val Arg Ser Gly Ser Gly Ile Ser Val Ile
179 305 310 315 320

181 Ser Ser Thr Ser Val Asp Gln Arg Leu Pro Glu Glu Pro Val Leu Glu
182 325 330 335

184 Asp Glu Gln Gln Gln Leu Glu Lys Lys Leu Pro Val Thr Phe Glu Asp
185 340 345 350

187 Lys Lys Arg Glu Asn Phe Glu Arg Gly Asn Leu Glu Leu Glu Lys Arg
188 355 360 365

190 Arg Gln Ala Leu Leu Glu Gln Gln Arg Lys Glu Gln Glu Arg Leu Ala
191 370 375 380

193 Gln Leu Glu Arg Ala Glu Gln Glu Arg Lys Glu Arg Glu Arg Gln Glu
194 385 390 395 400

196 Gln Glu Arg Lys Arg Gln Leu Glu Leu Glu Lys Gln Leu Glu Lys Gln
197 405 410 415

199 Arg Glu Leu Glu Arg Gln Arg Glu Glu Glu Arg Arg Lys Glu Ile Glu
200 420 425 430

202 Arg Arg Glu Ala Ala Lys Arg Glu Leu Glu Arg Gln Arg Gln Leu Glu
203 435 440 445

205 Trp Glu Arg Asn Arg Arg Gln Glu Leu Leu Asn Gln Arg Asn Lys Glu
206 450 455 460

208 Gln Glu Asp Ile Val Val Leu Lys Ala Lys Lys Lys Thr Leu Glu Phe
209 465 470 475 480

211 Glu Leu Glu Ala Leu Asn Asp Lys Lys His Gln Leu Glu Gly Lys Leu
212 485 490 495

214 Gln Asp Ile Arg Cys Arg Leu Thr Thr Gln Arg Gln Glu Ile Glu Ser
215 500 505 510

217 Thr Asn Lys Ser Arg Glu Leu Arg Ile Ala Glu Ile Thr His Leu Gln
218 515 520 525

220 Gln Gln Leu Gln Glu Ser Gln Gln Met Leu Gly Arg Leu Ile Pro Glu
221 530 535 540

223 Lys Gln Ile Leu Asn Asp Gln Leu Lys Gln Val Gln Gln Asn Ser Leu
224 545 550 555 560

226 His Arg Asp Ser Leu Val Thr Leu Lys Arg Ala Leu Glu Ala Lys Glu
227 565 570 575

229 Leu Ala Arg Gln His Leu Arg Asp Gln Leu Asp Glu Val Glu Lys Glu
230 580 585 590

232 Thr Arg Ser Lys Leu Gln Glu Ile Asp Ile Phe Asn Asn Gln Leu Lys
233 595 600 605

235 Glu Leu Arg Glu Ile His Asn Lys Gln Gln Leu Gln Lys Gln Lys Ser
236 610 615 620

238 Met Glu Ala Glu Arg Leu Lys Gln Lys Glu Gln Glu Arg Lys Ile Ile
239 625 630 635 640

241 Glu Leu Glu Lys Gln Lys Glu Glu Ala Gln Arg Arg Ala Gln Glu Arg
242 645 650 655

244 Asp Lys Gln Trp Leu Glu His Val Gln Gln Glu Asp Glu His Gln Arg
245 660 665 670

247 Pro Arg Lys Leu His Glu Glu Glu Lys Leu Lys Arg Glu Glu Ser Val
248 675 680 685

250 Lys Lys Lys Asp Gly Glu Glu Lys Gly Lys Gln Glu Ala Gln Asp Lys
251 690 695 700

253 Leu Gly Arg Leu Phe His Gln His Gln Glu Pro Ala Lys Pro Ala Val
254 705 710 715 720

256 Gln Ala Pro Trp Ser Thr Ala Glu Lys Gly Pro Leu Thr Ile Ser Ala
257 725 730 735

259 Gln Glu Asn Val Lys Val Val Tyr Tyr Arg Ala Leu Tyr Pro Phe Glu
260 740 745 750

262 Ser Arg Ser His Asp Glu Ile Thr Ile Gln Pro Gly Asp Ile Val Met
263 755 760 765

265 Val Asp Glu Ser Gln Thr Gly Glu Pro Gly Trp Leu Gly Gly Glu Leu
266 770 775 780

268 Lys Gly Lys Thr Gly Trp Phe Pro Ala Asn Tyr Ala Glu Lys Ile Pro
269 785 790 795 800

271 Glu Asn Glu Val Pro Ala Pro Val Lys Pro Val Thr Asp Ser Thr Ser
272 805 810 815

274 Ala Pro Ala Pro Lys Leu Ala Leu Arg Glu Thr Pro Ala Pro Leu Ala
275 820 825 830

277 Val Thr Ser Ser Glu Pro Ser Thr Thr Pro Asn Asn Trp Ala Asp Phe
278 835 840 845

280 Ser Ser Thr Trp Pro Thr Ser Thr Asn Glu Lys Pro Glu Thr Asp Asn
281 850 855 860

283 Trp Asp Ala Trp Ala Ala Gln Pro Ser Leu Thr Val Pro Ser Ala Gly
284 865 870 875 880

286 Gln Leu Arg Gln Arg Ser Ala Phe Thr Pro Ala Thr Ala Thr Gly Ser
287 885 890 895

289 Ser Pro Ser Pro Val Leu Gly Gln Gly Glu Lys Val Glu Gly Leu Gln
290 900 905 910

292 Ala Gln Ala Leu Tyr Pro Trp Arg Ala Lys Lys Asp Asn His Leu Asn
293 915 920 925

295 Phe Asn Lys Asn Asp Val Ile Thr Val Leu Glu Gln Gln Asp Met Trp
296 930 935 940

298 Trp Phe Gly Glu Val Gln Gly Gln Lys Gly Trp Phe Pro Lys Ser Tyr
299 945 950 955 960

301 Val Lys Leu Ile Ser Gly Pro Ile Arg Lys Ser Thr Ser Met Asp Ser
302 965 970 975

304 Gly Ser Ser Glu Ser Pro Ala Ser Leu Lys Arg Val Ala Ser Pro Ala
305 980 985 990

307 Ala Lys Pro Val Val Ser Gly Glu Glu Ile Ala Gln Val Ile Ala Ser
308 995 1000 1005

310 Tyr Thr Ala Thr Gly Pro Glu Gln Leu Thr Leu Ala Pro Gly Gln Leu
311 1010 1015 1020

313 Ile Leu Ile Arg Lys Lys Asn Pro Gly Gly Trp Trp Glu Gly Glu Leu
314 1025 1030 1035 1040

316 Gln Ala Arg Gly Lys Lys Arg Gln Ile Gly Trp Phe Pro Ala Asn Tyr
317 1045 1050 1055

319 Val Lys Leu Leu Ser Pro Gly Thr Ser Lys Ile Thr Pro Thr Glu Pro
320 1060 1065 1070

322 Pro Lys Ser Thr Ala Leu Ala Ala Val Cys Gln Val Ile Gly Met Tyr
PATENT APPLICATION: US/09/720,934 TIME: 14:05:25

Input Set : A:\Sequence Listing.txt

L:11 M:270 C: Current Application Number differs, Replaced Current Application Number
L:12 M:271 C: Current Filing Date differs, Replaced Current Filing Date
<110> Korenberg, Julie R
Chen, Xiao-Ning

<120> ISOLATED SH3 GENES ASSOCIATED WITH MYELOPROLIFERATIVE
DISORDERS AND LEUKEMIA, AND USES THEREOF

<130> 2320-1-001PCT

<140> PCT/US99/08371
<141> 1999-04-16

<150> 60/082,007
<151> 1998-04-16

<160> 109

<170> PatentIn Ver. 2.0

<210> 1
<211> 5199
<212> DNA
<213> Homo sapiens

<400> 1

caaaagaatt ccgggtacgg cggctcgcga ggaagaatcc cgagcgggct ccgggacgga 60 
cagagaggcg ggcggggatg gtgtgcgggg ctgcggctcc tgcgtccctc ccagcggcgc 120
gtgagcggca ctgatttgtc cctggggcgg cagcgcggac ccgcccggag atgaggcgtc 180
gattagcaag gtaaaagtaa cagaaccatg gctcagtttc caacaccttt tggtggcagc 240
ctggatatct gggccataac tgtagaggaa agagcgaagc atgatcagca gttccatagt 300
ttaaagccaa tatctggatt cattactggt gatcaagcta gaaacttttt ttttcaatct 360 
gggttacctc aacctgtttt agcacagata tgggcactag ctgacatgaa taatgatgga 420 
agaatggatc aagtggagtt ttccatagct atgaaactta tcaaactgaa gctacaagga 480
tatcagctac cctctgcact tccccctgtc atgaaacagc aaccagttgc tatttctagc 540 
gcaccagcat ttggtatggg aggtatcgcc agcatgccac cgcttacagc tgttgctcca 600 
gtgccaatgg gatccattcc agttgttgga atgtctccaa ccctagtatc ttctgttccc 660 
acagcagctg tgccccccct ggctaacggg gctccccctg ttatacaacc tctgcctgca 720
tttgctcatc ctgcagccac attgccaaag agttcttcct ttagtagatc tggtccaggg 780 
tcacaactaa acactaaatt acaaaaggca cagtcatttg atgtggccag tgtcccacca 840 
gtggcagagt gggctgttcc tcagtcatca agactgaaat acaggcaatt attcaatagt 900 
catgacaaaa ctatgagtgg acacttaaca ggtccccaag caagaactat tcttatgcag 960 
tcaagtttac cacaggctca gctggcttca atatggaatc tttctgacat tgatcaagat 1020
ggaaaactta cagcagagga atttatcctg gcaatgcacc tcattgatgt agctatgtct 1080 
ggccaaccac tgccacctgt cctgcctcca gaatacattc caccttcttt tagaagagtt 1140 
cgatctggca gtggtatatc tgtcataagc tcaacatctg tagatcagag gctaccagag 1200
gaaccagttt tagaagatga acaacaacaa ttagaaaaga aattacctgt aacgtttgaa 1260
gataagaagc gggagaactt tgaacgtggc aacctggaac tggagaaacg aaggcaagct 1320
ctcctggaac agcagcgcaa ggagcaggag cgcctggccc agctggagcg ggcggagcag 1380
gagaggaagg agcgtgagcg ccaggagcaa gagcgcaaaa gacaactgga actggagaag 1440
caactggaaa agcagcggga gctagaacgg cagagagagg aggagaggag gaaagaaatt 1500
gagaggcgag aggctgcaaa acgggaactt gaaaggcaac gacaacttga gtgggaacgg 1560
aatcgaaggc aagaactact aaatcaaaga aacaaagaac aagaggacat agttgtactg 1620
aaagcaaaga aaaagacttt ggaatttgaa ttagaagctc taaatgataa aaagcatcaa 1680 
ctagaaggga aacttcaaga tatcagatgt cgattgacca cccaaaggca agaaattgag 1740 
agcacaaaca aatctagaga gttgagaatt gccgaaatca cccatctaca gcaacaatta 1800 
caggaatctc agcaaatgct tggaagactt attccagaaa aacagatact caatgaccaa 1860
ttaaaacaag ttcagcagaa cagtttgcac agagattcac ttgttacact taaaagagcc 1920
ttagaagcaa aagaactagc tcggcagcac ctacgagacc aactggatga agtggagaaa 1980 
gaaactagat caaaactaca ggagattgat attttcaata atcagctgaa ggaactaaga 2040
gaaatacaca ataagcaaca actccagaag caaaagtcca tggaggctga acgactgaaa 2100 
cagaaagaac aagaacgaaa gatcatagaa ttagaaaaac aaaaagaaga agcccaaaga 2160 
cgagctcagg aaagggacaa gcagtggctg gagcatgtgc agcaggagga cgagcatcag 2220 
agaccaagaa aactccacga agaggaaaaa ctgaaaaggg aggagagtgt caaaaagaag 2280
gatggcgagg aaaaaggcaa acaggaagca caagacaagc tgggtcggct tttccatcaa 2340
caccaagaac cagctaagcc agctgtccag gcaccctggt ccactgcaga aaaaggtcca 2400
cttaccattt ctgcacagga aaatgtaaaa gtggtgtatt accgggcact gtaccccttt 2460 
gaatccagaa gccatgatga aatcactatc cagccaggag acatagtcat ggtggatgaa 2520 
agccaaactg gagaacccgg ctggcttgga ggagaattaa aaggaaagac agggtggttc 2580 
cctgcaaact atgcagagaa aatcccagaa aatgaggttc ccgctccagt gaaaccagtg 2640
actgattcaa catctgcccc tgcccccaaa ctggccttgc gtgagacccc cgcccctttg 2700 
gcagtaacct cttcagagcc ctccacgacc cctaataact gggccgactt cagctccacg 2760 
tggcccacca gcacgaatga gaaaccagaa acggataact gggatgcatg ggcagcccag 2820 
ccctctctca ccgttccaag tgccggccag ttaaggcaga ggtccgcctt tactccagcc 2880 
acggccactg gctcctcccc gtctcctgtg ctaggccagg gtgaaaaggt ggaggggcta 2940
caagctcaag ccctatatcc ttggagagcc aaaaaagaca accacttaaa ttttaacaaa 3000
aatgatgtca tcaccgtcct ggaacagcaa gacatgtggt ggtttggaga agttcaaggt 3060
cagaagggtt ggttccccaa gtcttacgtg aaactcattt cagggcccat aaggaagtct 3120 
acaagcatgg attctggttc ttcagagagt cctgctagtc taaagcgagt agcctctcca 3180 
gcagccaagc cggtcgtttc gggagaagaa attgcccagg ttattgcctc atacaccgcc 3240 
accggccccg agcagctcac tctcgcccct ggtcagctga ttttgatccg aaaaaagaac 3300 
ccaggtggat ggtgggaagg agagctgcaa gcacgtggga aaaagcgcca gataggctgg 3360 
ttcccagcta attatgtaaa gcttctaagc cctgggacga gcaaaatcac tccaacagag 3420
ccacctaagt caacagcatt agcggcagtg tgccaggtga ttgggatgta cgactacacc 3480
gcgcagaatg acgatgagct ggccttcaac aagggccaga tcatcaacgt cctcaacaag 3540
gaggaccctg actggtggaa aggagaagtc aatggacaag tggggctctt cccatccaat 3600
tatgtgaagc tgaccacaga catggaccca agccagcaat gaatcatatg ttgtccatcc 3660
ccccctcagg cttgaaagtc ctcaaagaga cccactatcc catatcactg cccagaggga 3720 
tgatgggaga tgcagccttg atcatgtgac ttccagcatg atcacctact gccttctgag 3780 
tagaagaact cactgcagag cagtttacct cattttacct tagttgcatg tgatcgcaat 3840
gtttgagtta ttacttgcag agataggagc aaaaattaca aaaacacaca gggtagtggg 3900 
tccttttgtg gctttcctag ttactcaaat tgactttccc ccacctttgc acaggtgctt 3960 
tcaatagttt taaaattatt tttaaatata tattttagct ttttaataaa caaaataaat 4020 
aaatgacttc tttgctattt tggttttgca aaaagaccca ctatcaagga atgctgcatg 4080 
tgctattaaa aattgttcca aatgtccata aatctgagac ttgatgtatt ttttcatttt 4140 
gtccagtgtt accaactaaa ttgctgcagt ttggggcttt tcccccttac catagaagtg 4200
cagaggagtt cagtatctct gttttaaaga cgtatagaat gagcccaatt aaagcgaagg 4260
tgattgtgct tgtttgtgtg tatcagctgt accttgttga gcatgtaata catcctgtac 4320
ataagaaatt agttctttcc atggcaaagc tattaccttg tacgatgctc taatcatatt 4380
gcatttaatt ttattttgca acagtgacct tgtagccaca tgagaaagca ctctgtgttt 4440
ttgttcggtc tcagatttat ctggttgagt tggtgttttg tttggggttt ttaattttgc 4500 
gtgtttgcat agcataaaat cagtagacaa caccactgag gtcgttacga tcaacgatat 4560 
ccacagtctc tttttagtct ctgttacatg aagttttatt ccagttactt ttcatggaat 4620
gacctatttt gaacaagtaa ttttcttgac aagaaagaat gtatagaagt ctccctgcaa 4680 
ttaatttcca atgtttacat tttttaacta ggactgtgga atttctacag attaatatga 4740 
aatggagctc atggtccgtt tgtgtgttag atatgctgta gctgaagccc tgtttgtctt 4800 
ttaaacacta gttggaagct ctcaataaaa atgcctgctg ctcacagcac agaaaatggg 4860 
gcagggggag cctcaagcac aatctagctg tcctcctaaa gactctgtaa tgctcaatcc 4920 
ccttgcgttc tcccggcgct gtcgggaggc tgtgctggtg gtcgtgtaga ggtccttttc 4980 
ctttcaaatg gtgcagagag agaggacctt tcctccttgt tcagttgcaa ttcagtattt 5040 
tcacggatat gaatgtaaaa tatataaata tataaacctg aggatttaac aaatgtaaaa 5100 
caaccttttg aattagttcc gagtatagat aattaaattt ttaaaacaaa agtaaaaaaa 5160 
aaaaaaaaaa aaaaaaaaaa aaaagtcgac gcggccgcg 5199 



<210> 2 
<211> 1143 
<212> PRT 

<213> Homo sapiens 



<400> 2 

Met Ala Gln Phe Pro Thr Pro Phe Gly Gly Ser Leu Asp Ile Trp Ala
1 5 10 15

Ile Thr Val Glu Glu Arg Ala Lys His Asp Gln Gln Phe His Ser Leu
20 25 30

Lys Pro Ile Ser Gly Phe Ile Thr Gly Asp Gln Ala Arg Asn Phe Phe
35 40 45

Phe Gln Ser Gly Leu Pro Gln Pro Val Leu Ala Gln Ile Trp Ala Leu
50 55 60

Ala Asp Met Asn Asn Asp Gly Arg Met Asp Gln Val Glu Phe Ser Ile
65 70 75 80

Ala Met Lys Leu Ile Lys Leu Lys Leu Gln Gly Tyr Gln Leu Pro Ser
85 90 95

Ala Leu Pro Pro Val Met Lys Gln Gln Pro Val Ala Ile Ser Ser Ala
100 105 110

Pro Ala Phe Gly Met Gly Gly Ile Ala Ser Met Pro Pro Leu Thr Ala
115 120 125

Val Ala Pro Val Pro Met Gly Ser Ile Pro Val Val Gly Met Ser Pro
130 135 140

Thr Leu Val Ser Ser Val Pro Thr Ala Ala Val Pro Pro Leu Ala Asn
145 150 155 160

Gly Ala Pro Pro Val Ile Gln Pro Leu Pro Ala Phe Ala His Pro Ala
165 170 175

Ala Thr Leu Pro Lys Ser Ser Ser Phe Ser Arg Ser Gly Pro Gly Ser
180 185 190

Gln Leu Asn Thr Lys Leu Gln Lys Ala Gln Ser Phe Asp Val Ala Ser
195 200 205

Val Pro Pro Val Ala Glu Trp Ala Val Pro Gln Ser Ser Arg Leu Lys
210 215 220

Tyr Arg Gln Leu Phe Asn Ser His Asp Lys Thr Met Ser Gly His Leu
225 230 235 240

Thr Gly Pro Gln Ala Arg Thr Ile Leu Met Gln Ser Ser Leu Pro Gln
245 250 255



Ala Gln Leu Ala Ser Ile Trp Asn Leu Ser Asp Ile Asp Gln Asp Gly
260 265 270

Lys Leu Thr Ala Glu Glu Phe Ile Leu Ala Met His Leu Ile Asp Val
275 280 285

Ala Met Ser Gly Gln Pro Leu Pro Pro Val Leu Pro Pro Glu Tyr Ile
290 295 300

Pro Pro Ser Phe Arg Arg Val Arg Ser Gly Ser Gly Ile Ser Val Ile
305 310 315 320

Ser Ser Thr Ser Val Asp Gln Arg Leu Pro Glu Glu Pro Val Leu Glu
325 330 335

Asp Glu Gln Gln Gln Leu Glu Lys Lys Leu Pro Val Thr Phe Glu Asp
340 345 350

Lys Lys Arg Glu Asn Phe Glu Arg Gly Asn Leu Glu Leu Glu Lys Arg
355 360 365

Arg Gln Ala Leu Leu Glu Gln Gln Arg Lys Glu Gln Glu Arg Leu Ala
370 375 380

Gln Leu Glu Arg Ala Glu Gln Glu Arg Lys Glu Arg Glu Arg Gln Glu
385 390 395 400

Gln Glu Arg Lys Arg Gln Leu Glu Leu Glu Lys Gln Leu Glu Lys Gln
405 410 415

Arg Glu Leu Glu Arg Gln Arg Glu Glu Glu Arg Arg Lys Glu Ile Glu
420 425 430

Arg Arg Glu Ala Ala Lys Arg Glu Leu Glu Arg Gln Arg Gln Leu Glu
435 440 445

Trp Glu Arg Asn Arg Arg Gln Glu Leu Leu Asn Gln Arg Asn Lys Glu
450 455 460

Gln Glu Asp Ile Val Val Leu Lys Ala Lys Lys Lys Thr Leu Glu Phe
465 470 475 480

Glu Leu Glu Ala Leu Asn Asp Lys Lys His Gln Leu Glu Gly Lys Leu
485 490 495

Gln Asp Ile Arg Cys Arg Leu Thr Thr Gln Arg Gln Glu Ile Glu Ser
500 505 510

Thr Asn Lys Ser Arg Glu Leu Arg Ile Ala Glu Ile Thr His Leu Gln
515 520 525

Gln Gln Leu Gln Glu Ser Gln Gln Met Leu Gly Arg Leu Ile Pro Glu
530 535 540

Lys Gln Ile Leu Asn Asp Gln Leu Lys Gln Val Gln Gln Asn Ser Leu
545 550 555 560



His Arg Asp Ser Leu Val Thr Leu Lys Arg Ala Leu Glu Ala Lys Glu 
565 570 575 



Leu Ala Arg Gln His Leu Arg Asp Gln Leu Asp Glu Val Glu Lys Glu
580 585 590

Thr Arg Ser Lys Leu Gln Glu Ile Asp Ile Phe Asn Asn Gln Leu Lys
595 600 605

Glu Leu Arg Glu Ile His Asn Lys Gln Gln Leu Gln Lys Gln Lys Ser
610 615 620

Met Glu Ala Glu Arg Leu Lys Gln Lys Glu Gln Glu Arg Lys Ile Ile
625 630 635 640

Glu Leu Glu Lys Gln Lys Glu Glu Ala Gln Arg Arg Ala Gln Glu Arg
645 650 655

Asp Lys Gln Trp Leu Glu His Val Gln Gln Glu Asp Glu His Gln Arg
660 665 670 

Pro Arg Lys Leu His Glu Glu Glu Lys Leu Lys Arg Glu Glu Ser Val 
675 680 685 

Lys Lys Lys Asp Gly Glu Glu Lys Gly Lys Gln Glu Ala Gln Asp Lys
690 695 700

Leu Gly Arg Leu Phe His Gln His Gln Glu Pro Ala Lys Pro Ala Val
705 710 715 720

Gln Ala Pro Trp Ser Thr Ala Glu Lys Gly Pro Leu Thr Ile Ser Ala
725 730 735

Gln Glu Asn Val Lys Val Val Tyr Tyr Arg Ala Leu Tyr Pro Phe Glu
740 745 750

Ser Arg Ser His Asp Glu Ile Thr Ile Gln Pro Gly Asp Ile Val Met
755 760 765

Val Asp Glu Ser Gln Thr Gly Glu Pro Gly Trp Leu Gly Gly Glu Leu
770 775 780

Lys Gly Lys Thr Gly Trp Phe Pro Ala Asn Tyr Ala Glu Lys Ile Pro
785 790 795 800 

Glu Asn Glu Val Pro Ala Pro Val Lys Pro Val Thr Asp Ser Thr Ser 
805 810 815 

Ala Pro Ala Pro Lys Leu Ala Leu Arg Glu Thr Pro Ala Pro Leu Ala 
820 825 830 

Val Thr Ser Ser Glu Pro Ser Thr Thr Pro Asn Asn Trp Ala Asp Phe 
835 840 845 

Ser Ser Thr Trp Pro Thr Ser Thr Asn Glu Lys Pro Glu Thr Asp Asn 
850 855 860 



Trp Asp Ala Trp Ala Ala Gln Pro Ser Leu Thr Val Pro Ser Ala Gly
865 870 875 880

Gln Leu Arg Gln Arg Ser Ala Phe Thr Pro Ala Thr Ala Thr Gly Ser
885 890 895

Ser Pro Ser Pro Val Leu Gly Gln Gly Glu Lys Val Glu Gly Leu Gln
900 905 910

Ala Gln Ala Leu Tyr Pro Trp Arg Ala Lys Lys Asp Asn His Leu Asn
915 920 925

Phe Asn Lys Asn Asp Val Ile Thr Val Leu Glu Gln Gln Asp Met Trp
930 935 940

Trp Phe Gly Glu Val Gln Gly Gln Lys Gly Trp Phe Pro Lys Ser Tyr
945 950 955 960

Val Lys Leu Ile Ser Gly Pro Ile Arg Lys Ser Thr Ser Met Asp Ser
965 970 975 

Gly Ser Ser Glu Ser Pro Ala Ser Leu Lys Arg Val Ala Ser Pro Ala 
980 985 990 

Ala Lys Pro Val Val Ser Gly Glu Glu Ile Ala Gln Val Ile Ala Ser
995 1000 1005

Tyr Thr Ala Thr Gly Pro Glu Gln Leu Thr Leu Ala Pro Gly Gln Leu
1010 1015 1020

Ile Leu Ile Arg Lys Lys Asn Pro Gly Gly Trp Trp Glu Gly Glu Leu
1025 1030 1035 1040

Gln Ala Arg Gly Lys Lys Arg Gln Ile Gly Trp Phe Pro Ala Asn Tyr
1045 1050 1055

Val Lys Leu Leu Ser Pro Gly Thr Ser Lys Ile Thr Pro Thr Glu Pro
1060 1065 1070

Pro Lys Ser Thr Ala Leu Ala Ala Val Cys Gln Val Ile Gly Met Tyr
1075 1080 1085

Asp Tyr Thr Ala Gln Asn Asp Asp Glu Leu Ala Phe Asn Lys Gly Gln
1090 1095 1100

Ile Ile Asn Val Leu Asn Lys Glu Asp Pro Asp Trp Trp Lys Gly Glu
1105 1110 1115 1120

Val Asn Gly Gln Val Gly Leu Phe Pro Ser Asn Tyr Val Lys Leu Thr
1125 1130 1135

Thr Asp Met Asp Pro Ser Gln
1140 



<210> 3 

<211> 5458 

<212> DNA 

<213> Homo sapiens

<400> 3

gcacgagagg gagcgaagga ggtagagaag agtggaggcg ccaggggagg gagcgtagct 60 
tggttgctcc gtagtacggc ggctcgcgag gaagaatccc gagcgggctc cgggacggac 120
agagaggcgg gcggggatgg tgtgcggggc tgcggctcct gcgtccctcc cagcggcgcg 180 
tgagcggcac tgatttgtcc ctggggcggc agcgcggacc cgcccggaga tgaggcgtcg 240 
attagcaagg taaaagtaac agaaccatgg ctcagtttcc aacacctttt ggtggcagcc 300
tggatatctg ggccataact gtagaggaaa gagcgaagca tgatcagcag ttccatagtt 360
taaagccaat atctggattc attactggtg atcaagctag aaactttttt tttcaatctg 420
ggttacctca acctgtttta gcacagatat gggcactagc tgacatgaat aatgatggaa 480
gaatggatca agtggagttt tccatagcta tgaaacttat caaactgaag ctacaaggat 540 
atcagctacc ctctgcactt ccccctgtca tgaaacagca accagttgct atttctagcg 600 
caccagcatt tggtatggga ggtatcgcca gcatgccacc gcttacagct gttgctccag 660 
tgccaatggg atccattcca gttgttggaa tgtctccaac cctagtatct tctgttccca 720 
cagcagctgt gccccccctg gctaacgggg ctccccctgt tatacaacct ctgcctgcat 780 
ttgctcatcc tgcagccaca ttgccaaaga gttcttcctt tagtagatct ggtccagggt 840
cacaactaaa cactaaatta caaaaggcac agtcatttga tgtggccagt gtcccaccag 900 
tggcagagtg ggctgttcct cagtcatcaa gactgaaata caggcaatta ttcaatagtc 960 
atgacaaaac tatgagtgga cacttaacag gtccccaagc aagaactatt cttatgcagt 1020
caagtttacc acaggctcag ctggcttcaa tatggaatct ttctgacatt gatcaagatg 1080
gaaaacttac agcagaggaa tttatcctgg caatgcacct cattgatgta gctatgtctg 1140
gccaaccact gccacctgtc ctgcctccag aatacattcc accttctttt agaagagttc 1200 
gatctggcag tggtatatct gtcataagct caacatctgt agatcagagg ctaccagagg 1260
aaccagtttt agaagatgaa caacaacaat tagaaaagaa attacctgta acgtttgaag 1320
ataagaagcg ggagaacttt gaacgtggca acctggaact ggagaaacga aggcaagctc 1380
tcctggaaca gcagcgcaag gagcaggagc gcctggccca gctggagcgg gcggagcagg 1440 
agaggaagga gcgtgagcgc caggagcaag agcgcaaaag acaactggaa ctggagaagc 1500 
aactggaaaa gcagcgggag ctagaacggc agagagagga ggagaggagg aaagaaattg 1560 
agaggcgaga ggctgcaaaa cgggaacttg aaaggcaacg acaacttgag tgggaacgga 1620
atcgaaggca agaactacta aatcaaagaa acaaagaaca agaggacata gttgtactga 1680
aagcaaagaa aaagactttg gaatttgaat tagaagctct aaatgataaa aagcatcaac 1740
tagaagggaa acttcaagat atcagatgtc gattgaccac ccaaaggcaa gaaattgaga 1800 
gcacaaacaa atctagagag ttgagaattg ccgaaatcac ccatctacag caacaattac 1860 
aggaatctca gcaaatgctt ggaagactta ttccagaaaa acagatactc aatgaccaat 1920
taaaacaagt tcagcagaac agtttgcaca gagattcact tgttacactt aaaagagcct 1980
tagaagcaaa agaactagct cggcagcacc tacgagacca actggatgaa gtggagaaag 2040
aaactagatc aaaactacag gagattgata ttttcaataa tcagctgaag gaactaagag 2100
aaatacacaa taagcaacaa ctccagaagc aaaagtccat ggaggctgaa cgactgaaac 2160
agaaagaaca agaacgaaag atcatagaat tagaaaaaca aaaagaagaa gcccaaagac 2220
gagctcagga aagggacaag cagtggctgg agcatgtgca gcaggaggac gagcatcaga 2280 
gaccaagaaa actccacgaa gaggaaaaac tgaaaaggga ggagagtgtc aaaaagaagg 2340 
atggcgagga aaaaggcaaa caggaagcac aagacaagct gggtcggctt ttccatcaac 2400
accaagaacc agctaagcca gctgtccagg caccctggtc cactgcagaa aaaggtccac 2460
ttaccatttc tgcacaggaa aatgtaaaag tggtgtatta ccgggcactg tacccctttg 2520
aatccagaag ccatgatgaa atcactatcc agccaggaga catagtcatg gttaaagggg 2580
aatgggtgga tgaaagccaa actggagaac ccggctggct tggaggagaa ttaaaaggaa 2640
agacagggtg gttccctgca aactatgcag agaaaatccc agaaaatgag gttcccgctc 2700
cagtgaaacc agtgactgat tcaacatctg cccctgcccc caaactggcc ttgcgtgaga 2760
cccccgcccc tttggcagta acctcttcag agccctccac gacccctaat aactgggccg 2820 
acttcagctc cacgtggccc accagcacga atgagaaacc agaaacggat aactgggatg 2880
catgggcagc ccagccctct ctcaccgttc caagtgccgg ccagttaagg cagaggtccg 2940
cctttactcc agccacggcc actggctcct ccccgtctcc tgtgctaggc cagggtgaaa 3000 
aggtggaggg gctacaagct caagccctat atccttggag agccaaaaaa gacaaccact 3060
taaattttaa caaaaatgat gtcatcaccg tcctggaaca gcaagacatg tggtggtttg 3120
gagaagttca aggtcagaag ggttggttcc ccaagtctta cgtgaaactc atttcagggc 3180 
ccataaggaa gtctacaagc atggattctg gttcttcaga gagtcctgct agtctaaagc 3240
gagtagcctc tccagcagcc aagccggtcg tttcgggaga agaatttatt gccatgtaca 3300
cttacgagag ttctgagcaa ggagatttaa cctttcagca aggggatgtg attttggtta 3360 



ccaagaaaga tggtgactgg tggacaggaa cagtgggcga caaggccgga gtcttccctt 3420
ctaactatgt gaggcttaaa gattcagagg gctctggaac tgctgggaaa acagggagtt 3480 
taggaaaaaa acctgaaatt gcccaggtta ttgcctcata caccgccacc ggccccgagc 3540 
agctcactct cgcccctggt cagctgattt tgatccgaaa aaagaaccca ggtggatggt 3600 
gggaaggaga gctgcaagca cgtgggaaaa agcgccagat aggctggttc ccagctaatt 3660
atgtaaagct tctaagccct gggacgagca aaatcactcc aacagagcca cctaagtcaa 3720
cagcattagc ggcagtgtgc caggtgattg ggatgtacga ctacaccgcg cagaatgacg 3780 
atgagctggc cttcaacaag ggccagatca tcaacgtcct caacaaggag gaccctgact 3840
ggtggaaagg agaagtcaat ggacaagtgg ggctcttccc atccaattat gtgaagctga 3900
ccacagacat ggacccaagc cagcaatgaa tcatatgttg tccatccccc cctcaggctt 3960
gaaagtcctc aaagagaccc actatcccat atcactgccc agagggatga tgggagatgc 4020
agccttgatc atgtgacttc cagcatgatc acctactgcc ttctgagtag aagaactcac 4080 
tgcagagcag tttacctcat tttaccttag ttgcatgtga tcgcaatgtt tgagttatta 4140 
cttgcagaga taggagcaaa aattacaaaa acacacaggg tagtgggtcc ttttgtggct 4200
ttcctagtta ctcaaattga ctttccccca cctttgcaca ggtgctttca atagttttaa 4260 
aattattttt aaatatatat tttagctttt taataaacaa aataaataaa tgacttcttt 4320 
gctattttgg ttttgcaaaa agacccacta tcaaggaatg ctgcatgtgc tattaaaaat 4380 
tgttccaaat gtccataaat ctgagacttg atgtattttt tcattttgtc cagtgttacc 4440
aactaaattg tgcagtttgg ggcttttccc ccttaccata gaagtgcaga ggagttcagt 4500 
atctctgttt taaagacgta tagaatgagc ccaattaaag cgaaggtgtt tgtgcttgtt 4560 
tgtgtgtatc agctgtacct tgttgagcat gtaatacatc ctgtacataa gaaattagtt 4620 
ctttccatgg caaagctatt accttgtacg atgctctaat catattgcat ttaattttat 4680 
tttgcacagt gaccttgtag ccacatgaga aagcactctg tgtttttgtt cggtctcaga 4740 
tttatctggt tgagttggtg ttttgtttgg ggtttttaat tttgcgtgtt tgcatagcat 4800 
aaaatcagta gacaacacca ctgaggtcgt tacgatcaac gatatccaca gtctcttttt 4860 
agtctctgtt acatgaagtt ttattccagt tacttttcat ggaatgacct attttgaaca 4920 
agtaattttc ttgacaagaa agaatgtata gaagtctccc tgcaattaat ttccaatgtt 4980 
tacatttttt aactagactg tggaatttct acagattaat atgaaatgga gctcatggtc 5040 
cgtttgtgtg ttagatatgc tgtagctgaa gccctgtttg tcttttaaac actagttgga 5100 
agctctcaat aaaaatgcct gctgctcaca gcacagaaaa tggggcaggg ggagcctcaa 5160 
gcacaatcta gctgtcctcc taaagactct gtaatgctca ctcccctcgc gttctcccgg 5220 
cgctgtcggg aggctgtgct ggtggtcgtg tagaggtcct tctcctttca catggtgcag 5280
agagcgagga cctctcctcc tcgttcagtt gcacttcagt attttcacgg atatgaatgt 5340 
aaaatatata aatatataaa cctgcggctt taacaactgt aatacaacct tttgaattag 5400 
ttccgtgtat agataattaa attcttcata caaaagttaa aaaaaaaaaa aaaaaaaa 5458 

<210> 4 
<211> 1220 
<212> PRT 

<213> Homo sapiens 
<400> 4 

Met Ala Gln Phe Pro Thr Pro Phe Gly Gly Ser Leu Asp Ile Trp Ala
1 5 10 15

Ile Thr Val Glu Glu Arg Ala Lys His Asp Gln Gln Phe His Ser Leu
20 25 30

Lys Pro Ile Ser Gly Phe Ile Thr Gly Asp Gln Ala Arg Asn Phe Phe
35 40 45

Phe Gln Ser Gly Leu Pro Gln Pro Val Leu Ala Gln Ile Trp Ala Leu
50 55 60

Ala Asp Met Asn Asn Asp Gly Arg Met Asp Gln Val Glu Phe Ser Ile
65 70 75 80

Ala Met Lys Leu Ile Lys Leu Lys Leu Gln Gly Tyr Gln Leu Pro Ser
85 90 95

Ala Leu Pro Pro Val Met Lys Gln Gln Pro Val Ala Ile Ser Ser Ala
100 105 110

Pro Ala Phe Gly Met Gly Gly Ile Ala Ser Met Pro Pro Leu Thr Ala
115 120 125

Val Ala Pro Val Pro Met Gly Ser Ile Pro Val Val Gly Met Ser Pro
130 135 140

Thr Leu Val Ser Ser Val Pro Thr Ala Ala Val Pro Pro Leu Ala Asn
145 150 155 160

Gly Ala Pro Pro Val Ile Gln Pro Leu Pro Ala Phe Ala His Pro Ala
165 170 175

Ala Thr Leu Pro Lys Ser Ser Ser Phe Ser Arg Ser Gly Pro Gly Ser
180 185 190

Gln Leu Asn Thr Lys Leu Gln Lys Ala Gln Ser Phe Asp Val Ala Ser
195 200 205

Val Pro Pro Val Ala Glu Trp Ala Val Pro Gln Ser Ser Arg Leu Lys
210 215 220

Tyr Arg Gln Leu Phe Asn Ser His Asp Lys Thr Met Ser Gly His Leu
225 230 235 240

Thr Gly Pro Gln Ala Arg Thr Ile Leu Met Gln Ser Ser Leu Pro Gln
245 250 255 

Ala Gln Leu Ala Ser Ile Trp Asn Leu Ser Asp Ile Asp Gln Asp Gly
260 265 270

Lys Leu Thr Ala Glu Glu Phe Ile Leu Ala Met His Leu Ile Asp Val
275 280 285

Ala Met Ser Gly Gln Pro Leu Pro Pro Val Leu Pro Pro Glu Tyr Ile
290 295 300

Pro Pro Ser Phe Arg Arg Val Arg Ser Gly Ser Gly Ile Ser Val Ile
305 310 315 320

Ser Ser Thr Ser Val Asp Gln Arg Leu Pro Glu Glu Pro Val Leu Glu
325 330 335

Asp Glu Gln Gln Gln Leu Glu Lys Lys Leu Pro Val Thr Phe Glu Asp
340 345 350 

Lys Lys Arg Glu Asn Phe Glu Arg Gly Asn Leu Glu Leu Glu Lys Arg 
355 360 365 



Arg Gln Ala Leu Leu Glu Gln Gln Arg Lys Glu Gln Glu Arg Leu Ala
370 375 380

Gln Leu Glu Arg Ala Glu Gln Glu Arg Lys Glu Arg Glu Arg Gln Glu
385 390 395 400

Gln Glu Arg Lys Arg Gln Leu Glu Leu Glu Lys Gln Leu Glu Lys Gln
405 410 415

Arg Glu Leu Glu Arg Gln Arg Glu Glu Glu Arg Arg Lys Glu Ile Glu
420 425 430

Arg Arg Glu Ala Ala Lys Arg Glu Leu Glu Arg Gln Arg Gln Leu Glu
435 440 445

Trp Glu Arg Asn Arg Arg Gln Glu Leu Leu Asn Gln Arg Asn Lys Glu
450 455 460 

Gln Glu Asp Ile Val Val Leu Lys Ala Lys Lys Lys Thr Leu Glu Phe
465 470 475 480

Glu Leu Glu Ala Leu Asn Asp Lys Lys His Gln Leu Glu Gly Lys Leu
485 490 495

Gln Asp Ile Arg Cys Arg Leu Thr Thr Gln Arg Gln Glu Ile Glu Ser
500 505 510

Thr Asn Lys Ser Arg Glu Leu Arg Ile Ala Glu Ile Thr His Leu Gln
515 520 525

Gln Gln Leu Gln Glu Ser Gln Gln Met Leu Gly Arg Leu Ile Pro Glu
530 535 540

Lys Gln Ile Leu Asn Asp Gln Leu Lys Gln Val Gln Gln Asn Ser Leu
545 550 555 560 

His Arg Asp Ser Leu Val Thr Leu Lys Arg Ala Leu Glu Ala Lys Glu 
565 570 575 

Leu Ala Arg Gln His Leu Arg Asp Gln Leu Asp Glu Val Glu Lys Glu
580 585 590

Thr Arg Ser Lys Leu Gln Glu Ile Asp Ile Phe Asn Asn Gln Leu Lys
595 600 605

Glu Leu Arg Glu Ile His Asn Lys Gln Gln Leu Gln Lys Gln Lys Ser
610 615 620

Met Glu Ala Glu Arg Leu Lys Gln Lys Glu Gln Glu Arg Lys Ile Ile
625 630 635 640

Glu Leu Glu Lys Gln Lys Glu Glu Ala Gln Arg Arg Ala Gln Glu Arg
645 650 655

Asp Lys Gln Trp Leu Glu His Val Gln Gln Glu Asp Glu His Gln Arg
660 665 670 

Pro Arg Lys Leu His Glu Glu Glu Lys Leu Lys Arg Glu Glu Ser Val 
675 680 685 



Lys Lys Lys Asp Gly Glu Glu Lys Gly Lys Gln Glu Ala Gln Asp Lys
690 695 700

Leu Gly Arg Leu Phe His Gln His Gln Glu Pro Ala Lys Pro Ala Val
705 710 715 720

Gln Ala Pro Trp Ser Thr Ala Glu Lys Gly Pro Leu Thr Ile Ser Ala
725 730 735

Gln Glu Asn Val Lys Val Val Tyr Tyr Arg Ala Leu Tyr Pro Phe Glu
740 745 750

Ser Arg Ser His Asp Glu Ile Thr Ile Gln Pro Gly Asp Ile Val Met
755 760 765

Val Lys Gly Glu Trp Val Asp Glu Ser Gln Thr Gly Glu Pro Gly Trp
770 775 780 

Leu Gly Gly Glu Leu Lys Gly Lys Thr Gly Trp Phe Pro Ala Asn Tyr 
785 790 795 800 

Ala Glu Lys Ile Pro Glu Asn Glu Val Pro Ala Pro Val Lys Pro Val
805 810 815 

Thr Asp Ser Thr Ser Ala Pro Ala Pro Lys Leu Ala Leu Arg Glu Thr 
820 825 830 

Pro Ala Pro Leu Ala Val Thr Ser Ser Glu Pro Ser Thr Thr Pro Asn 
835 840 845 



Asn Trp Ala Asp Phe Ser Ser Thr Trp Pro Thr Ser Thr Asn Glu Lys
850 855 860

Pro Glu Thr Asp Asn Trp Asp Ala Trp Ala Ala Gln Pro Ser Leu Thr
865 870 875 880



Val Pro Ser Ala Gly Gln Leu Arg Gln Arg Ser Ala Phe Thr Pro Ala
885 890 895

Thr Ala Thr Gly Ser Ser Pro Ser Pro Val Leu Gly Gln Gly Glu Lys
900 905 910

Val Glu Gly Leu Gln Ala Gln Ala Leu Tyr Pro Trp Arg Ala Lys Lys
915 920 925

Asp Asn His Leu Asn Phe Asn Lys Asn Asp Val Ile Thr Val Leu Glu
930 935 940

Gln Gln Asp Met Trp Trp Phe Gly Glu Val Gln Gly Gln Lys Gly Trp
945 950 955 960

Phe Pro Lys Ser Tyr Val Lys Leu Ile Ser Gly Pro Ile Arg Lys Ser
965 970 975



Thr Ser Met Asp Ser Gly Ser Ser Glu Ser Pro Ala Ser Leu Lys Arg 
980 985 990 



Val Ala Ser Pro Ala Ala Lys Pro Val Val Ser Gly Glu Glu Phe Ile
995 1000 1005

Ala Met Tyr Thr Tyr Glu Ser Ser Glu Gln Gly Asp Leu Thr Phe Gln
1010 1015 1020

Gln Gly Asp Val Ile Leu Val Thr Lys Lys Asp Gly Asp Trp Trp Thr
1025 1030 1035 1040 

Gly Thr Val Gly Asp Lys Ala Gly Val Phe Pro Ser Asn Tyr Val Arg 
1045 1050 1055 

Leu Lys Asp Ser Glu Gly Ser Gly Thr Ala Gly Lys Thr Gly Ser Leu 
1060 1065 1070 

Gly Lys Lys Pro Glu Ile Ala Gln Val Ile Ala Ser Tyr Thr Ala Thr
1075 1080 1085

Gly Pro Glu Gln Leu Thr Leu Ala Pro Gly Gln Leu Ile Leu Ile Arg
1090 1095 1100 



Lys Lys Asn Pro Gly Gly Trp Trp Glu Gly Glu Leu Gln Ala Arg Gly
1105 1110 1115 1120

Lys Lys Arg Gln Ile Gly Trp Phe Pro Ala Asn Tyr Val Lys Leu Leu
1125 1130 1135

Ser Pro Gly Thr Ser Lys Ile Thr Pro Thr Glu Pro Pro Lys Ser Thr
1140 1145 1150

Ala Leu Ala Ala Val Cys Gln Val Ile Gly Met Tyr Asp Tyr Thr Ala
1155 1160 1165

Gln Asn Asp Asp Glu Leu Ala Phe Asn Lys Gly Gln Ile Ile Asn Val
1170 1175 1180

Leu Asn Lys Glu Asp Pro Asp Trp Trp Lys Gly Glu Val Asn Gly Gln
1185 1190 1195 1200 



Val Gly Leu Phe Pro Ser Asn Tyr Val Lys Leu Thr Thr Asp Met Asp 
1205 1210 1215 



Pro Ser Gln Gln
1220 



<210> 5 
<211> 23 
<212> PRT 

<213> Homo sapiens 
<220> 

<223> From Seq ID 5 to ID 38, there are 34 protein
sequences translated from Seq ID No. 3. Together,
they form the whole protein sequence.



<400> 5 



Thr Arg Gly Ser Glu Gly Gly Arg Glu Glu Trp Arg Arg Gln Gly Arg
1 5 10 15



Glu Arg Ser Leu Val Ala Pro 
20 
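
The <223> note for SEQ ID NO: 5 says that SEQ ID NOs: 5 to 38 were translated from SEQ ID NO: 3. The listing does not state the procedure, so the following is only a minimal Python sketch of one way such fragments can be derived, assuming the standard genetic code, a single fixed reading frame, and a split at every stop codon. The names `split_orfs`, `CODON_TABLE`, and `head` are hypothetical and chosen for illustration.

```python
# Illustrative sketch only: names and the choice of frame are assumptions,
# not taken from the sequence listing.
BASES = "tcag"
# Standard genetic code in t, c, a, g order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def split_orfs(dna: str, frame: int) -> list[str]:
    """Translate dna in reading frame 0, 1, or 2 and split at stop codons."""
    seq = "".join(dna.lower().split())  # drop spaces and line breaks
    protein = "".join(
        CODON_TABLE[seq[i:i + 3]] for i in range(frame, len(seq) - 2, 3)
    )
    # Empty pieces (two adjacent stops) are discarded.
    return [piece for piece in protein.split("*") if piece]


# First 80 nucleotides of SEQ ID NO: 3, copied from the listing.
head = (
    "gcacgagagg gagcgaagga ggtagagaag agtggaggcg "
    "ccaggggagg gagcgtagct tggttgctcc gtagtacggc"
)
peptides = split_orfs(head, frame=2)
print(peptides[0])  # TRGSEGGREEWRRQGRERSLVAP
```

With `frame=2` (the third reading frame), the first fragment of this excerpt is the 23-residue peptide given as SEQ ID NO: 5 in one-letter code, which is consistent with the <223> note.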



<210> 6 
<211> 52 
<212> PRT 

<213> Homo sapiens 
<400> 6 

Tyr Gly Gly Ser Arg Gly Arg Ile Pro Ser Gly Leu Arg Asp Gly Gln
1 5 10 15

Arg Gly Gly Arg Gly Trp Cys Ala Gly Leu Arg Leu Leu Arg Pro Ser
20 25 30

Gln Arg Arg Val Ser Gly Thr Asp Leu Ser Leu Gly Arg Gln Arg Gly
35 40 45 

Pro Ala Arg Arg 
50 



<210> 7 
<211> 3 
<212> PRT 

<213> Homo sapiens 

<400> 7 
Gly Val Asp 
1 



<210> 8 
<211> 1227 
<212> PRT 

<213> Homo sapiens 



<400> 8 

Gln Gly Lys Ser Asn Arg Thr Met Ala Gln Phe Pro Thr Pro Phe Gly
1 5 10 15

Gly Ser Leu Asp Ile Trp Ala Ile Thr Val Glu Glu Arg Ala Lys His
20 25 30

Asp Gln Gln Phe His Ser Leu Lys Pro Ile Ser Gly Phe Ile Thr Gly
35 40 45

Asp Gln Ala Arg Asn Phe Phe Phe Gln Ser Gly Leu Pro Gln Pro Val
50 55 60

Leu Ala Gln Ile Trp Ala Leu Ala Asp Met Asn Asn Asp Gly Arg Met
65 70 75 80



Asp Gln Val Glu Phe Ser Ile Ala Met Lys Leu Ile Lys Leu Lys Leu
85 90 95

Gln Gly Tyr Gln Leu Pro Ser Ala Leu Pro Pro Val Met Lys Gln Gln
100 105 110

Pro Val Ala Ile Ser Ser Ala Pro Ala Phe Gly Met Gly Gly Ile Ala
115 120 125

Ser Met Pro Pro Leu Thr Ala Val Ala Pro Val Pro Met Gly Ser Ile
130 135 140

Pro Val Val Gly Met Ser Pro Thr Leu Val Ser Ser Val Pro Thr Ala
145 150 155 160

Ala Val Pro Pro Leu Ala Asn Gly Ala Pro Pro Val Ile Gln Pro Leu
165 170 175 

Pro Ala Phe Ala His Pro Ala Ala Thr Leu Pro Lys Ser Ser Ser Phe 
180 185 190 

Ser Arg Ser Gly Pro Gly Ser Gln Leu Asn Thr Lys Leu Gln Lys Ala
195 200 205

Gln Ser Phe Asp Val Ala Ser Val Pro Pro Val Ala Glu Trp Ala Val
210 215 220

Pro Gln Ser Ser Arg Leu Lys Tyr Arg Gln Leu Phe Asn Ser His Asp
225 230 235 240

Lys Thr Met Ser Gly His Leu Thr Gly Pro Gln Ala Arg Thr Ile Leu
245 250 255

Met Gln Ser Ser Leu Pro Gln Ala Gln Leu Ala Ser Ile Trp Asn Leu
260 265 270

Ser Asp Ile Asp Gln Asp Gly Lys Leu Thr Ala Glu Glu Phe Ile Leu
275 280 285

Ala Met His Leu Ile Asp Val Ala Met Ser Gly Gln Pro Leu Pro Pro
290 295 300

Val Leu Pro Pro Glu Tyr Ile Pro Pro Ser Phe Arg Arg Val Arg Ser
305 310 315 320

Gly Ser Gly Ile Ser Val Ile Ser Ser Thr Ser Val Asp Gln Arg Leu
325 330 335 



Pro Glu Glu Pro Val Leu Glu Asp Glu Gln Gln Gln Leu Glu Lys Lys
340 345 350

Leu Pro Val Thr Phe Glu Asp Lys Lys Arg Glu Asn Phe Glu Arg Gly
355 360 365



Asn Leu Glu Leu Glu Lys Arg Arg Gln Ala Leu Leu Glu Gln Gln Arg
370 375 380

Lys Glu Gln Glu Arg Leu Ala Gln Leu Glu Arg Ala Glu Gln Glu Arg
385 390 395 400

Lys Glu Arg Glu Arg Gln Glu Gln Glu Arg Lys Arg Gln Leu Glu Leu
405 410 415

Glu Lys Gln Leu Glu Lys Gln Arg Glu Leu Glu Arg Gln Arg Glu Glu
420 425 430

Glu Arg Arg Lys Glu Ile Glu Arg Arg Glu Ala Ala Lys Arg Glu Leu
435 440 445

Glu Arg Gln Arg Gln Leu Glu Trp Glu Arg Asn Arg Arg Gln Glu Leu
450 455 460

Leu Asn Gln Arg Asn Lys Glu Gln Glu Asp Ile Val Val Leu Lys Ala
465 470 475 480 

Lys Lys Lys Thr Leu Glu Phe Glu Leu Glu Ala Leu Asn Asp Lys Lys 
485 490 495 

His Gln Leu Glu Gly Lys Leu Gln Asp Ile Arg Cys Arg Leu Thr Thr
500 505 510

Gln Arg Gln Glu Ile Glu Ser Thr Asn Lys Ser Arg Glu Leu Arg Ile
515 520 525

Ala Glu Ile Thr His Leu Gln Gln Gln Leu Gln Glu Ser Gln Gln Met
530 535 540

Leu Gly Arg Leu Ile Pro Glu Lys Gln Ile Leu Asn Asp Gln Leu Lys
545 550 555 560

Gln Val Gln Gln Asn Ser Leu His Arg Asp Ser Leu Val Thr Leu Lys
565 570 575

Arg Ala Leu Glu Ala Lys Glu Leu Ala Arg Gln His Leu Arg Asp Gln
580 585 590 

Leu Asp Glu Val Glu Lys Glu Thr Arg Ser Lys Leu Gln Glu Ile Asp
595 600 605

Ile Phe Asn Asn Gln Leu Lys Glu Leu Arg Glu Ile His Asn Lys Gln
610 615 620

Gln Leu Gln Lys Gln Lys Ser Met Glu Ala Glu Arg Leu Lys Gln Lys
625 630 635 640

Glu Gln Glu Arg Lys Ile Ile Glu Leu Glu Lys Gln Lys Glu Glu Ala
645 650 655

Gln Arg Arg Ala Gln Glu Arg Asp Lys Gln Trp Leu Glu His Val Gln
660 665 670

Gln Glu Asp Glu His Gln Arg Pro Arg Lys Leu His Glu Glu Glu Lys
675 680 685 



Leu Lys Arg Glu Glu Ser Val Lys Lys Lys Asp Gly Glu Glu Lys Gly
690 695 700

Lys Gln Glu Ala Gln Asp Lys Leu Gly Arg Leu Phe His Gln His Gln
705 710 715 720

Glu Pro Ala Lys Pro Ala Val Gln Ala Pro Trp Ser Thr Ala Glu Lys
725 730 735

Gly Pro Leu Thr Ile Ser Ala Gln Glu Asn Val Lys Val Val Tyr Tyr
740 745 750

Arg Ala Leu Tyr Pro Phe Glu Ser Arg Ser His Asp Glu Ile Thr Ile
755 760 765

Gln Pro Gly Asp Ile Val Met Val Lys Gly Glu Trp Val Asp Glu Ser
770 775 780



Gln Thr Gly Glu Pro Gly Trp Leu Gly Gly Glu Leu Lys Gly Lys Thr
785 790 795 800

Gly Trp Phe Pro Ala Asn Tyr Ala Glu Lys Ile Pro Glu Asn Glu Val
805 810 815 

Pro Ala Pro Val Lys Pro Val Thr Asp Ser Thr Ser Ala Pro Ala Pro 
820 825 830 

Lys Leu Ala Leu Arg Glu Thr Pro Ala Pro Leu Ala Val Thr Ser Ser 
835 840 845 

Glu Pro Ser Thr Thr Pro Asn Asn Trp Ala Asp Phe Ser Ser Thr Trp 
850 855 860 

Pro Thr Ser Thr Asn Glu Lys Pro Glu Thr Asp Asn Trp Asp Ala Trp 
865 870 875 880 

Ala Ala Gln Pro Ser Leu Thr Val Pro Ser Ala Gly Gln Leu Arg Gln
885 890 895 

Arg Ser Ala Phe Thr Pro Ala Thr Ala Thr Gly Ser Ser Pro Ser Pro 
900 905 910 

Val Leu Gly Gln Gly Glu Lys Val Glu Gly Leu Gln Ala Gln Ala Leu
915 920 925 

Tyr Pro Trp Arg Ala Lys Lys Asp Asn His Leu Asn Phe Asn Lys Asn 
930 935 940 

Asp Val Ile Thr Val Leu Glu Gln Gln Asp Met Trp Trp Phe Gly Glu
945 950 955 960 



Val Gln Gly Gln Lys Gly Trp Phe Pro Lys Ser Tyr Val Lys Leu Ile
965 970 975

Ser Gly Pro Ile Arg Lys Ser Thr Ser Met Asp Ser Gly Ser Ser Glu
980 985 990



Ser Pro Ala Ser Leu Lys Arg Val Ala Ser Pro Ala Ala Lys Pro Val 
995 1000 1005 



Val Ser Gly Glu Glu Phe Ile Ala Met Tyr Thr Tyr Glu Ser Ser Glu
1010 1015 1020

Gln Gly Asp Leu Thr Phe Gln Gln Gly Asp Val Ile Leu Val Thr Lys
1025 1030 1035 1040 

Lys Asp Gly Asp Trp Trp Thr Gly Thr Val Gly Asp Lys Ala Gly Val 
1045 1050 1055 

Phe Pro Ser Asn Tyr Val Arg Leu Lys Asp Ser Glu Gly Ser Gly Thr 
1060 1065 1070 

Ala Gly Lys Thr Gly Ser Leu Gly Lys Lys Pro Glu Ile Ala Gln Val
1075 1080 1085

Ile Ala Ser Tyr Thr Ala Thr Gly Pro Glu Gln Leu Thr Leu Ala Pro
1090 1095 1100

Gly Gln Leu Ile Leu Ile Arg Lys Lys Asn Pro Gly Gly Trp Trp Glu
1105 1110 1115 1120

Gly Glu Leu Gln Ala Arg Gly Lys Lys Arg Gln Ile Gly Trp Phe Pro
1125 1130 1135

Ala Asn Tyr Val Lys Leu Leu Ser Pro Gly Thr Ser Lys Ile Thr Pro
1140 1145 1150

Thr Glu Pro Pro Lys Ser Thr Ala Leu Ala Ala Val Cys Gln Val Ile
1155 1160 1165

Gly Met Tyr Asp Tyr Thr Ala Gln Asn Asp Asp Glu Leu Ala Phe Asn
1170 1175 1180

Lys Gly Gln Ile Ile Asn Val Leu Asn Lys Glu Asp Pro Asp Trp Trp
1185 1190 1195 1200

Lys Gly Glu Val Asn Gly Gln Val Gly Leu Phe Pro Ser Asn Tyr Val
1205 1210 1215

Lys Leu Thr Thr Asp Met Asp Pro Ser Gln Gln
1220 1225 



<210> 9 
<211> 10 
<212> PRT 

<213> Homo sapiens 
<400> 9 

Ile Ile Cys Cys Pro Ser Pro Pro Gln Ala
1 5 10



<210> 10 



<211> 15 
<212> PRT 

<213> Homo sapiens 
<400> 10 

Lys Ser Ser Lys Arg Pro Thr Ile Pro Tyr His Cys Pro Glu Gly
1 5 10 15



<210> 11 
<211> 5 
<212> PRT 

<213> Homo sapiens 
<400> 11 

Trp Glu Met Gln Pro
1 5 



<210> 12 
<211> 13 
<212> PRT 
<213> Homo sapiens

<400> 12

Ser Cys Asp Phe Gln His Asp His Leu Leu Pro Ser Glu
1 5 10



<210> 13 
<211> 20 
<212> PRT 
<213> Homo sapiens

<400> 13

Lys Asn Ser Leu Gln Ser Ser Leu Pro His Phe Thr Leu Val Ala Cys
1 5 10 15



Asp Arg Asn Val 
20 



<210> 14 
<211> 28 
<212> PRT 

<213> Homo sapiens 
<400> 14 

Val Ile Thr Cys Arg Asp Arg Ser Lys Asn Tyr Lys Asn Thr Gln Gly
1 5 10 15

Ser Gly Ser Phe Cys Gly Phe Pro Ser Tyr Ser Asn 
20 25 



<210> 15 
<211> 30 



<212> PRT 

<213> Homo sapiens 



<400> 15 

Leu Ser Pro Thr Phe Ala Gln Val Leu Ser Ile Val Leu Lys Leu Phe
1 5 10 15

Leu Asn Ile Tyr Phe Ser Phe Leu Ile Asn Lys Ile Asn Lys
20 25 30 



<210> 16 
<211> 20 
<212> PRT 

<213> Homo sapiens 
<400> 16 

Leu Leu Cys Tyr Phe Gly Phe Ala Lys Arg Pro Thr Ile Lys Glu Cys
1 5 10 15

Cys Met Cys Tyr 
20 



<210> 17 
<211> 34 
<212> PRT 

<213> Homo sapiens 
<400> 17 

Lys Leu Phe Gln Met Ser Ile Asn Leu Arg Leu Asp Val Phe Phe His
1 5 10 15

Phe Val Gln Cys Tyr Gln Leu Asn Cys Ala Val Trp Gly Phe Ser Pro
20 25 30 



Leu Pro 



<210> 18 
<211> 13 
<212> PRT 

<213> Homo sapiens 
<400> 18 

Lys Cys Arg Gly Val Gln Tyr Leu Cys Phe Lys Asp Val
1 5 10



<210> 19 
<211> 4 
<212> PRT 

<213> Homo sapiens 

<400> 19 

Asn Glu Pro Asn 



<210> 20 
<211> 15 
<212> PRT 

<213> Homo sapiens 
<400> 20 

Ser Glu Gly Val Cys Ala Cys Leu Cys Val Ser Ala Val Pro Cys 
1 5 10 15



<210> 21 
<211> 7 
<212> PRT 

<213> Homo sapiens 
<400> 21 

Ala Cys Asn Thr Ser Cys Thr 
1 5 



<210> 22 
<211> 29 
<212> PRT 

<213> Homo sapiens 
<400> 22 

Glu Ile Ser Ser Phe His Gly Lys Ala Ile Thr Leu Tyr Asp Ala Leu
1 5 10 15

Ile Ile Leu His Leu Ile Leu Phe Cys Thr Val Thr Leu
20 25



<210> 23 
<211> 33 
<212> PRT 

<213> Homo sapiens 
<400> 23 

Pro His Glu Lys Ala Leu Cys Val Phe Val Arg Ser Gln Ile Tyr Leu
1 5 10 15

Val Glu Leu Val Phe Cys Leu Gly Phe Leu Ile Leu Arg Val Cys Ile
20 25 30 

Ala 



<210> 24 
<211> 2 
<212> PRT 

<213> Homo sapiens 



<400> 24 
Asn Gln
1 



<210> 25 
<211> 16 
<212> PRT 

<213> Homo sapiens 
<400> 25 

Thr Thr Pro Leu Arg Ser Leu Arg Ser Thr Ile Ser Thr Val Ser Phe
1 5 10 15



<210> 26 
<211> 14 
<212> PRT 

<213> Homo sapiens 
<400> 26 

Ser Leu Leu His Glu Val Leu Phe Gln Leu Leu Phe Met Glu
1 5 10



<210> 27
<211> 5
<212> PRT

<213> Homo sapiens
<400> 27

Pro Ile Leu Asn Lys
1 5



<210> 28
<211> 2
<212> PRT

<213> Homo sapiens
<400> 28
Phe Ser
1



<210> 29
<211> 29
<212> PRT

<213> Homo sapiens
<400> 29

Gln Glu Arg Met Tyr Arg Ser Leu Pro Ala Ile Asn Phe Gln Cys Leu
1 5 10 15

His Phe Leu Thr Arg Leu Trp Asn Phe Tyr Arg Leu Ile
20 25



<210> 30 
<211> 9 
<212> PRT 

<213> Homo sapiens 
<400> 30 

Asn Gly Ala His Gly Pro Phe Val Cys 
1 5 



<210> 31 

<211> 4 

<212> PRT 

<213> Homo sapiens 

<400> 31 
Ile Cys Cys Ser
1 



<210> 32 
<211> 33 
<212> PRT 

<213> Homo sapiens 
<400> 32 

Ser Pro Val Cys Leu Leu Asn Thr Ser Trp Lys Leu Ser Ile Lys Met
1 5 10 15 

Pro Ala Ala His Ser Thr Glu Asn Gly Ala Gly Gly Ala Ser Ser Thr 
20 25 30 

Ile



<210> 33 
<211> 3 
<212> PRT 

<213> Homo sapiens 

<400> 33 
Leu Ser Ser 
1 



<210> 34 
<211> 50 
<212> PRT 

<213> Homo sapiens 
<400> 34 

Arg Leu Cys Asn Ala His Ser Pro Arg Val Leu Pro Ala Leu Ser Gly 
1 5 10 15

Gly Cys Ala Gly Gly Arg Val Glu Val Leu Leu Leu Ser His Gly Ala
20 25 30



Glu Ser Glu Asp Leu Ser Ser Ser Phe Ser Cys Thr Ser Val Phe Ser 
35 40 45 

Arg Ile
50 



<210> 35 
<211> 1 
<212> PRT 

<213> Homo sapiens 

<400> 35 
Met 
1 






<210> 36 
<211> 2 
<212> PRT 

<213> Homo sapiens 

<400> 36 
Asn Ile
1 



<210> 37 

<211> 22 

<212> PRT 

<213> Homo sapiens 



<400> 37

Ile Tyr Lys Pro Ala Ala Leu Thr Thr Val Ile Gln Pro Phe Glu Leu
1 5 10 15

Val Pro Cys Ile Asp Asn
20 



<210> 38 
<211> 12 
<212> PRT 

<213> Homo sapiens 
<400> 38 

Ile Leu His Thr Lys Val Lys Lys Lys Lys Lys Lys
1 5 10



<210> 39 

<211> 5195 

<212> DNA 

<213> Homo sapiens 



<400> 39 

agagtggagg cgccagggga gggagcgtag cttggttgct ccgtagtacg gcggctcgcg 60 
aggaagaatc ccgagcgggc tccgggacgg acagagaggc gggcggggat ggtgtgcggg 120
gctgcggctc ctgcgtccct cccagcggcg cgtgagcggc actgatttgt ccctggggcg 180 
gcagcgcgga cccgcccgga gatgaggcgt cgattagcaa ggtaaaagta acagaaccat 240 
ggctcagttt ccaacacctt ttggtggcag cctggatatc tgggccataa ctgtagagga 300 
aagagcgaag catgatcagc agttccatag tttaaagcca atatctggat tcattactgg 360 
tgatcaagct agaaactttt tttttcaatc tgggttacct caacctgttt tagcacagat 420
atgggcacta gctgacatga ataatgatgg aagaatggat caagtggagt tttccatagc 480 
tatgaaactt atcaaactga agctacaagg atatcagcta ccctctgcac ttccccctgt 540 
catgaaacag caaccagttg ctatttctag cgcaccagca tttggtatgg gaggtatcgc 600 
cagcatgcca ccgcttacag ctgttgctcc agtgccaatg ggatccattc cagttgttgg 660 
aatgtctcca accctagtat cttctgttcc cacagcagct gtgccccccc tggctaacgg 720
ggctccccct gttatacaac ctctgcctgc atttgctcat cctgcagcca cattgccaaa 780 
gagttcttcc tttagtagat ctggtccagg gtcacaacta aacactaaat tacaaaaggc 840 
acagtcattt gatgtggcca gtgtcccacc agtggcagag tgggctgttc ctcagtcatc 900 
aagactgaaa tacaggcaat tattcaatag tcatgacaaa actatgagtg gacacttaac 960 
aggtccccaa gcaagaacta ttcttatgca gtcaagttta ccacaggctc agctggcttc 1020
aatatggaat ctttctgaca ttgatcaaga tggaaaactt acagcagagg aatttatcct 1080 
ggcaatgcac ctcattgatg tagctatgtc tggccaacca ctgccacctg tcctgcctcc 1140 
agaatacatt ccaccttctt ttagaagagt tcgatctggc agtggtatat ctgtcataag 1200
ctcaacatct gtagatcaga ggctaccaga ggaaccagtt ttagaagatg aacaacaaca 1260 
attagaaaag aaattacctg taacgtttga agataagaag cgggagaact ttgaacgtgg 1320 
caacctggaa ctggagaaac gaaggcaagc tctcctggaa cagcagcgca aggagcagga 1380
gcgcctggcc cagctggagc gggcggagca ggagaggaag gagcgtgagc gccaggagca 1440 
agagcgcaaa agacaactgg aactggagaa gcaactggaa aagcagcggg agctagaacg 1500 
gcagagagag gaggagagga ggaaagaaat tgagaggcga gaggctgcaa aacgggaact 1560 
tgaaaggcaa cgacaacttg agtgggaacg gaatcgaagg caagaactac taaatcaaag 1620
aaacaaagaa caagaggaca tagttgtact gaaagcaaag aaaaagactt tggaatttga 1680 
attagaagct ctaaatgata aaaagcatca actagaaggg aaacttcaag atatcagatg 1740 
tcgattgacc acccaaaggc aagaaattga gagcacaaac aaatctagag agttgagaat 1800 
tgccgaaatc acccatctac agcaacaatt acaggaatct cagcaaatgc ttggaagact 1860 
tattccagaa aaacagatac tcaatgacca attaaaacaa gttcagcaga acagtttgca 1920
cagagattca cttgttacac ttaaaagagc cttagaagca aaagaactag ctcggcagca 1980
cctacgagac caactggatg aagtggagaa agaaactaga tcaaaactac aggagattga 2040
tattttcaat aatcagctga aggaactaag agaaatacac aataagcaac aactccagaa 2100 
gcaaaagtcc atggaggctg aacgactgaa acagaaagaa caagaacgaa agatcataga 2160 
attagaaaaa caaaaagaag aagcccaaag acgagctcag gaaagggaca agcagtggct 2220
ggagcatgtg cagcaggagg acgagcatca gagaccaaga aaactccacg aagaggaaaa 2280
actgaaaagg gaggagagtg tcaaaaagaa ggatggcgag gaaaaaggca aacaggaagc 2340
acaagacaag ctgggtcggc ttttccatca acaccaagaa ccagctaagc cagctgtcca 2400 
ggcaccctgg tccactgcag aaaaaggtcc acttaccatt tctgcacagg aaaatgtaaa 2460 
agtggtgtat taccgggcac tgtacccctt tgaatccaga agccatgatg aaatcactat 2520
ccagccagga gacatagtca tggtggatga aagccaaact ggagaacccg gctggcttgg 2580
aggagaatta aaaggaaaga cagggtggtt ccctgcaaac tatgcagaga aaatcccaga 2640
aaatgaggtt cccgctccag tgaaaccagt gactgattca acatctgccc ctgcccccaa 2700 
actggccttg cgtgagaccc ccgccccttt ggcagtaacc tcttcagagc cctccacgac 2760 
ccctaataac tgggccgact tcagctccac gtggcccacc agcacgaatg agaaaccaga 2820 
aacggataac tgggatgcat gggcagccca gccctctctc accgttccaa gtgccggcca 2880 
gttaaggcag aggtccgcct ttactccagc cacggccact ggctcctccc cgtctcctgt 2940 
gctaggccag ggtgaaaagg tggaggggct acaagctcaa gccctatatc cttggagagc 3000
caaaaaagac aaccacttaa attttaacaa aaatgatgtc atcaccgtcc tggaacagca 3060
agacatgtgg tggtttggag aagttcaagg tcagaagggt tggttcccca agtcttacgt 3120
gaaactcatt tcagggccca taaggaagtc tacaagcatg gattctggtt cttcagagag 3180 
tcctgctagt ctaaagcgag tagcctctcc agcagccaag ccggtcgttt cgggagaaga 3240 
atttattgcc atgtacactt acgagagttc tgagcaagga gatttaacct ttcagcaagg 3300
ggatgtgatt ttggttacca agaaagatgg tgactggtgg acaggaacag tgggcgacaa 3360



ggccggagtc ttcccttcta actatgtgag gcttaaagat tcagagggct ctggaactgc 3420
tgggaaaaca gggagtttag gaaaaaaacc tgaaattgcc caggttattg cctcatacac 3480
cgccaccggc cccgagcagc tcactctcgc ccctggtcag ctgattttga tccgaaaaaa 3540
gaacccaggt ggatggtggg aaggagagct gcaagcacgt gggaaaaagc gccagatagg 3600 
ctggttccca gctaattatg taaagcttct aagccctggg acgagcaaaa tcactccaac 3660 
agagccacct aagtcaacag cattagcggc agtgtgccag gtgattggga tgtacgacta 3720
caccgcgcag aatgacgatg agctggcctt caacaagggc cagatcatca acgtcctcaa 3780 
caaggaggac cctgactggt ggaaaggaga agtcaatgga caagtggggc tcttcccatc 3840
caattatgtg aagctgacca cagacatgga cccaagccag caatgaatca tatgttgtcc 3900
atccccccct caggcttgaa agtccttttg tggctttcct agttactcaa attgactttc 3960 
ccccaccttt gcacaggtgc tttcaatagt tttaaaatta tttttaaata tatattttag 4020
ctttttaata aacaaaataa ataaatgact tctttgctat tttggttttg caaaaagacc 4080 
cactatcaag gaatgctgca tgtgctatta aaaattgttc caaatgtcca taaatctgag 4140 
acttgatgta ttttttcatt ttgtccagtg ttaccaacta aattgtgcag tttggggctt 4200
ttccccctta ccatagaagt gcagaggagt tcagtatctc tgttttaaag acgtatagaa 4260 
tgagcccaat taaagcgaag gtgtttgtgc ttgtttgtgt gtatcagctg taccttgttg 4320
agcatgtaat acatcctgta cataagaaat tagttctttc catggcaaag ctattacctt 4380 
gtacgatgct ctaatcatat tgcatttaat tttattttgc acagtgacct tgtagccaca 444 0 
tgagaaagca ctctgtgttt ttgttcggtc tcagatttat ctggttgagt tggtgttttg 4500 
tttggggttt ttaattttgc gtgtttgcat agcataaaat cagtagacaa caccactgag 4560 
gtcgttacga tcaacgatat ccacagtctc tttttagtct ctgttacatg aagttttatt 4620
ccagttactt ttcatggaat gacctatttt gaacaagtaa ttttcttgac aagaaagaat 4680 
gtatagaagt ctccctgcaa ttaatttcca atgtttacat tttttaacta gactgtggaa 4740 
tttctacaga ttaatatgaa atggagctca tggtccgttt gtgtgttaga tatgctgtag 4800 
ctgaagccct gtttgtcttt taaacactag ttggaagctc tcaataaaaa tgcctgctgc 4860 
tcacagcaca gaaaatgggg cagggggagc ctcaagcaca atctagctgt cctcctaaag 4920
actctgtaat gctcactccc ctcgcgttct cccggcgctg tcgggaggct gtgctggtgg 4980 
tcgtgtagag gtccttctcc tttcacatgg tgcagagagc gaggacctct cctcctcgtt 5040 
cagttgcact tcagtatttt cacggatatg aatgtaaaat atataaatat ataaacctgc 5100 
ggctttaaca actgtaatac aaccttttga attagttccg tgtatagata attaaattct 5160 
tcatacaaaa gttaaaaaaa aaaaaaaaaa aaaaa 5195 



<210> 40 

<211> 1215 

<212> PRT 

<213> Homo sapiens 



<400> 40 

Met Ala Gln Phe Pro Thr Pro Phe Gly Gly Ser Leu Asp Ile Trp Ala
1 5 10 15

Ile Thr Val Glu Glu Arg Ala Lys His Asp Gln Gln Phe His Ser Leu
20 25 30

Lys Pro Ile Ser Gly Phe Ile Thr Gly Asp Gln Ala Arg Asn Phe Phe
35 40 45

Phe Gln Ser Gly Leu Pro Gln Pro Val Leu Ala Gln Ile Trp Ala Leu
50 55 60

Ala Asp Met Asn Asn Asp Gly Arg Met Asp Gln Val Glu Phe Ser Ile
65 70 75 80



Ala Met Lys Leu Ile Lys Leu Lys Leu Gln Gly Tyr Gln Leu Pro Ser
85 90 95

Ala Leu Pro Pro Val Met Lys Gln Gln Pro Val Ala Ile Ser Ser Ala
100 105 110

Pro Ala Phe Gly Met Gly Gly Ile Ala Ser Met Pro Pro Leu Thr Ala
115 120 125

Val Ala Pro Val Pro Met Gly Ser Ile Pro Val Val Gly Met Ser Pro
130 135 140

Thr Leu Val Ser Ser Val Pro Thr Ala Ala Val Pro Pro Leu Ala Asn
145 150 155 160

Gly Ala Pro Pro Val Ile Gln Pro Leu Pro Ala Phe Ala His Pro Ala
165 170 175

Ala Thr Leu Pro Lys Ser Ser Ser Phe Ser Arg Ser Gly Pro Gly Ser
180 185 190

Gln Leu Asn Thr Lys Leu Gln Lys Ala Gln Ser Phe Asp Val Ala Ser
195 200 205

Val Pro Pro Val Ala Glu Trp Ala Val Pro Gln Ser Ser Arg Leu Lys
210 215 220

Tyr Arg Gln Leu Phe Asn Ser His Asp Lys Thr Met Ser Gly His Leu
225 230 235 240

Thr Gly Pro Gln Ala Arg Thr Ile Leu Met Gln Ser Ser Leu Pro Gln
245 250 255

Ala Gln Leu Ala Ser Ile Trp Asn Leu Ser Asp Ile Asp Gln Asp Gly
260 265 270

Lys Leu Thr Ala Glu Glu Phe Ile Leu Ala Met His Leu Ile Asp Val
275 280 285

Ala Met Ser Gly Gln Pro Leu Pro Pro Val Leu Pro Pro Glu Tyr Ile
290 295 300

Pro Pro Ser Phe Arg Arg Val Arg Ser Gly Ser Gly Ile Ser Val Ile
305 310 315 320

Ser Ser Thr Ser Val Asp Gln Arg Leu Pro Glu Glu Pro Val Leu Glu
325 330 335

Asp Glu Gln Gln Gln Leu Glu Lys Lys Leu Pro Val Thr Phe Glu Asp
340 345 350



Lys Lys Arg Glu Asn Phe Glu Arg Gly Asn Leu Glu Leu Glu Lys Arg 
355 360 365 

Arg Gln Ala Leu Leu Glu Gln Gln Arg Lys Glu Gln Glu Arg Leu Ala
370 375 380

Gln Leu Glu Arg Ala Glu Gln Glu Arg Lys Glu Arg Glu Arg Gln Glu
385 390 395 400

Gln Glu Arg Lys Arg Gln Leu Glu Leu Glu Lys Gln Leu Glu Lys Gln
405 410 415

Arg Glu Leu Glu Arg Gln Arg Glu Glu Glu Arg Arg Lys Glu Ile Glu
420 425 430

Arg Arg Glu Ala Ala Lys Arg Glu Leu Glu Arg Gln Arg Gln Leu Glu
435 440 445

Trp Glu Arg Asn Arg Arg Gln Glu Leu Leu Asn Gln Arg Asn Lys Glu
450 455 460

Gln Glu Asp Ile Val Val Leu Lys Ala Lys Lys Lys Thr Leu Glu Phe
465 470 475 480

Glu Leu Glu Ala Leu Asn Asp Lys Lys His Gln Leu Glu Gly Lys Leu
485 490 495

Gln Asp Ile Arg Cys Arg Leu Thr Thr Gln Arg Gln Glu Ile Glu Ser
500 505 510

Thr Asn Lys Ser Arg Glu Leu Arg Ile Ala Glu Ile Thr His Leu Gln
515 520 525

Gln Gln Leu Gln Glu Ser Gln Gln Met Leu Gly Arg Leu Ile Pro Glu
530 535 540

Lys Gln Ile Leu Asn Asp Gln Leu Lys Gln Val Gln Gln Asn Ser Leu
545 550 555 560

His Arg Asp Ser Leu Val Thr Leu Lys Arg Ala Leu Glu Ala Lys Glu
565 570 575

Leu Ala Arg Gln His Leu Arg Asp Gln Leu Asp Glu Val Glu Lys Glu
580 585 590

Thr Arg Ser Lys Leu Gln Glu Ile Asp Ile Phe Asn Asn Gln Leu Lys
595 600 605

Glu Leu Arg Glu Ile His Asn Lys Gln Gln Leu Gln Lys Gln Lys Ser
610 615 620

Met Glu Ala Glu Arg Leu Lys Gln Lys Glu Gln Glu Arg Lys Ile Ile
625 630 635 640

Glu Leu Glu Lys Gln Lys Glu Glu Ala Gln Arg Arg Ala Gln Glu Arg
645 650 655

Asp Lys Gln Trp Leu Glu His Val Gln Gln Glu Asp Glu His Gln Arg
660 665 670

Pro Arg Lys Leu His Glu Glu Glu Lys Leu Lys Arg Glu Glu Ser Val
675 680 685

Lys Lys Lys Asp Gly Glu Glu Lys Gly Lys Gln Glu Ala Gln Asp Lys
690 695 700



Leu Gly Arg Leu Phe His Gln His Gln Glu Pro Ala Lys Pro Ala Val
705 710 715 720

Gln Ala Pro Trp Ser Thr Ala Glu Lys Gly Pro Leu Thr Ile Ser Ala
725 730 735

Gln Glu Asn Val Lys Val Val Tyr Tyr Arg Ala Leu Tyr Pro Phe Glu
740 745 750

Ser Arg Ser His Asp Glu Ile Thr Ile Gln Pro Gly Asp Ile Val Met
755 760 765

Val Asp Glu Ser Gln Thr Gly Glu Pro Gly Trp Leu Gly Gly Glu Leu
770 775 780

Lys Gly Lys Thr Gly Trp Phe Pro Ala Asn Tyr Ala Glu Lys Ile Pro
785 790 795 800

Glu Asn Glu Val Pro Ala Pro Val Lys Pro Val Thr Asp Ser Thr Ser 
805 810 815 

Ala Pro Ala Pro Lys Leu Ala Leu Arg Glu Thr Pro Ala Pro Leu Ala 
820 825 830 

Val Thr Ser Ser Glu Pro Ser Thr Thr Pro Asn Asn Trp Ala Asp Phe
835 840 845

Ser Ser Thr Trp Pro Thr Ser Thr Asn Glu Lys Pro Glu Thr Asp Asn
850 855 860

Trp Asp Ala Trp Ala Ala Gln Pro Ser Leu Thr Val Pro Ser Ala Gly
865 870 875 880

Gln Leu Arg Gln Arg Ser Ala Phe Thr Pro Ala Thr Ala Thr Gly Ser
885 890 895

Ser Pro Ser Pro Val Leu Gly Gln Gly Glu Lys Val Glu Gly Leu Gln
900 905 910

Ala Gln Ala Leu Tyr Pro Trp Arg Ala Lys Lys Asp Asn His Leu Asn
915 920 925

Phe Asn Lys Asn Asp Val Ile Thr Val Leu Glu Gln Gln Asp Met Trp
930 935 940

Trp Phe Gly Glu Val Gln Gly Gln Lys Gly Trp Phe Pro Lys Ser Tyr
945 950 955 960

Val Lys Leu Ile Ser Gly Pro Ile Arg Lys Ser Thr Ser Met Asp Ser
965 970 975

Gly Ser Ser Glu Ser Pro Ala Ser Leu Lys Arg Val Ala Ser Pro Ala
980 985 990

Ala Lys Pro Val Val Ser Gly Glu Glu Phe Ile Ala Met Tyr Thr Tyr
995 1000 1005

Glu Ser Ser Glu Gln Gly Asp Leu Thr Phe Gln Gln Gly Asp Val Ile
1010 1015 1020

Leu Val Thr Lys Lys Asp Gly Asp Trp Trp Thr Gly Thr Val Gly Asp 
1025 1030 1035 1040 

Lys Ala Gly Val Phe Pro Ser Asn Tyr Val Arg Leu Lys Asp Ser Glu 
1045 1050 1055 

Gly Ser Gly Thr Ala Gly Lys Thr Gly Ser Leu Gly Lys Lys Pro Glu 
1060 1065 1070 

Ile Ala Gln Val Ile Ala Ser Tyr Thr Ala Thr Gly Pro Glu Gln Leu
1075 1080 1085

Thr Leu Ala Pro Gly Gln Leu Ile Leu Ile Arg Lys Lys Asn Pro Gly
1090 1095 1100

Gly Trp Trp Glu Gly Glu Leu Gln Ala Arg Gly Lys Lys Arg Gln Ile
1105 1110 1115 1120

Gly Trp Phe Pro Ala Asn Tyr Val Lys Leu Leu Ser Pro Gly Thr Ser
1125 1130 1135

Lys Ile Thr Pro Thr Glu Pro Pro Lys Ser Thr Ala Leu Ala Ala Val
1140 1145 1150

Cys Gln Val Ile Gly Met Tyr Asp Tyr Thr Ala Gln Asn Asp Asp Glu
1155 1160 1165

Leu Ala Phe Asn Lys Gly Gln Ile Ile Asn Val Leu Asn Lys Glu Asp
1170 1175 1180

Pro Asp Trp Trp Lys Gly Glu Val Asn Gly Gln Val Gly Leu Phe Pro
1185 1190 1195 1200

Ser Asn Tyr Val Lys Leu Thr Thr Asp Met Asp Pro Ser Gln Gln
1205 1210 1215 



<210> 41 
<211> 14 
<212> PRT 

<213> Homo sapiens 
<220> 

<223> From Seq ID 41 to ID 70, there are 30 protein
sequences translated from Seq ID No. 6. Together,
they form the whole protein sequence.

<400> 41 

Glu Trp Arg Arg Gln Gly Arg Glu Arg Ser Leu Val Ala Pro
1 5 10



<210> 42 
<211> 52 
<212> PRT 



<213> Homo sapiens 



<400> 42 
Tyr Gly Gly Ser Arg Gly Arg Ile Pro Ser Gly Leu Arg Asp Gly Gln
1 5 10 15

Arg Gly Gly Arg Gly Trp Cys Ala Gly Leu Arg Leu Leu Arg Pro Ser
20 25 30

Gln Arg Arg Val Ser Gly Thr Asp Leu Ser Leu Gly Arg Gln Arg Gly
35 40 45

Pro Ala Arg Arg
50



<210> 43 
<211> 3 
<212> PRT 

<213> Homo sapiens 

<400> 43 
Gly Val Asp 
1 



<210> 44 
<211> 1222 
<212> PRT 

<213> Homo sapiens 
<400> 44 

Gln Gly Lys Ser Asn Arg Thr Met Ala Gln Phe Pro Thr Pro Phe Gly
1 5 10 15

Gly Ser Leu Asp Ile Trp Ala Ile Thr Val Glu Glu Arg Ala Lys His
20 25 30

Asp Gln Gln Phe His Ser Leu Lys Pro Ile Ser Gly Phe Ile Thr Gly
35 40 45

Asp Gln Ala Arg Asn Phe Phe Phe Gln Ser Gly Leu Pro Gln Pro Val
50 55 60

Leu Ala Gln Ile Trp Ala Leu Ala Asp Met Asn Asn Asp Gly Arg Met
65 70 75 80

Asp Gln Val Glu Phe Ser Ile Ala Met Lys Leu Ile Lys Leu Lys Leu
85 90 95

Gln Gly Tyr Gln Leu Pro Ser Ala Leu Pro Pro Val Met Lys Gln Gln
100 105 110

Pro Val Ala Ile Ser Ser Ala Pro Ala Phe Gly Met Gly Gly Ile Ala
115 120 125

Ser Met Pro Pro Leu Thr Ala Val Ala Pro Val Pro Met Gly Ser Ile
130 135 140



Pro Val Val Gly Met Ser Pro Thr Leu Val Ser Ser Val Pro Thr Ala 
145 150 155 160 

Ala Val Pro Pro Leu Ala Asn Gly Ala Pro Pro Val Ile Gln Pro Leu
165 170 175

Pro Ala Phe Ala His Pro Ala Ala Thr Leu Pro Lys Ser Ser Ser Phe 
180 185 190 

Ser Arg Ser Gly Pro Gly Ser Gln Leu Asn Thr Lys Leu Gln Lys Ala
195 200 205

Gln Ser Phe Asp Val Ala Ser Val Pro Pro Val Ala Glu Trp Ala Val
210 215 220

Pro Gln Ser Ser Arg Leu Lys Tyr Arg Gln Leu Phe Asn Ser His Asp
225 230 235 240

Lys Thr Met Ser Gly His Leu Thr Gly Pro Gln Ala Arg Thr Ile Leu
245 250 255

Met Gln Ser Ser Leu Pro Gln Ala Gln Leu Ala Ser Ile Trp Asn Leu
260 265 270

Ser Asp Ile Asp Gln Asp Gly Lys Leu Thr Ala Glu Glu Phe Ile Leu
275 280 285

Ala Met His Leu Ile Asp Val Ala Met Ser Gly Gln Pro Leu Pro Pro
290 295 300

Val Leu Pro Pro Glu Tyr Ile Pro Pro Ser Phe Arg Arg Val Arg Ser
305 310 315 320

Gly Ser Gly Ile Ser Val Ile Ser Ser Thr Ser Val Asp Gln Arg Leu
325 330 335



Pro Glu Glu Pro Val Leu Glu Asp Glu Gln Gln Gln Leu Glu Lys Lys
340 345 350

Leu Pro Val Thr Phe Glu Asp Lys Lys Arg Glu Asn Phe Glu Arg Gly
355 360 365

Asn Leu Glu Leu Glu Lys Arg Arg Gln Ala Leu Leu Glu Gln Gln Arg
370 375 380

Lys Glu Gln Glu Arg Leu Ala Gln Leu Glu Arg Ala Glu Gln Glu Arg
385 390 395 400

Lys Glu Arg Glu Arg Gln Glu Gln Glu Arg Lys Arg Gln Leu Glu Leu
405 410 415

Glu Lys Gln Leu Glu Lys Gln Arg Glu Leu Glu Arg Gln Arg Glu Glu
420 425 430

Glu Arg Arg Lys Glu Ile Glu Arg Arg Glu Ala Ala Lys Arg Glu Leu
435 440 445



Glu Arg Gln Arg Gln Leu Glu Trp Glu Arg Asn Arg Arg Gln Glu Leu
450 455 460

Leu Asn Gln Arg Asn Lys Glu Gln Glu Asp Ile Val Val Leu Lys Ala
465 470 475 480

Lys Lys Lys Thr Leu Glu Phe Glu Leu Glu Ala Leu Asn Asp Lys Lys 
485 490 495 

His Gln Leu Glu Gly Lys Leu Gln Asp Ile Arg Cys Arg Leu Thr Thr
500 505 510

Gln Arg Gln Glu Ile Glu Ser Thr Asn Lys Ser Arg Glu Leu Arg Ile
515 520 525

Ala Glu Ile Thr His Leu Gln Gln Gln Leu Gln Glu Ser Gln Gln Met
530 535 540

Leu Gly Arg Leu Ile Pro Glu Lys Gln Ile Leu Asn Asp Gln Leu Lys
545 550 555 560

Gln Val Gln Gln Asn Ser Leu His Arg Asp Ser Leu Val Thr Leu Lys
565 570 575

Arg Ala Leu Glu Ala Lys Glu Leu Ala Arg Gln His Leu Arg Asp Gln
580 585 590

Leu Asp Glu Val Glu Lys Glu Thr Arg Ser Lys Leu Gln Glu Ile Asp
595 600 605

Ile Phe Asn Asn Gln Leu Lys Glu Leu Arg Glu Ile His Asn Lys Gln
610 615 620

Gln Leu Gln Lys Gln Lys Ser Met Glu Ala Glu Arg Leu Lys Gln Lys
625 630 635 640

Glu Gln Glu Arg Lys Ile Ile Glu Leu Glu Lys Gln Lys Glu Glu Ala
645 650 655

Gln Arg Arg Ala Gln Glu Arg Asp Lys Gln Trp Leu Glu His Val Gln
660 665 670

Gln Glu Asp Glu His Gln Arg Pro Arg Lys Leu His Glu Glu Glu Lys
675 680 685

Leu Lys Arg Glu Glu Ser Val Lys Lys Lys Asp Gly Glu Glu Lys Gly 
690 695 700 

Lys Gln Glu Ala Gln Asp Lys Leu Gly Arg Leu Phe His Gln His Gln
705 710 715 720

Glu Pro Ala Lys Pro Ala Val Gln Ala Pro Trp Ser Thr Ala Glu Lys
725 730 735

Gly Pro Leu Thr Ile Ser Ala Gln Glu Asn Val Lys Val Val Tyr Tyr
740 745 750



Arg Ala Leu Tyr Pro Phe Glu Ser Arg Ser His Asp Glu Ile Thr Ile
755 760 765

Gln Pro Gly Asp Ile Val Met Val Asp Glu Ser Gln Thr Gly Glu Pro
770 775 780

Gly Trp Leu Gly Gly Glu Leu Lys Gly Lys Thr Gly Trp Phe Pro Ala 
785 790 795 800 

Asn Tyr Ala Glu Lys Ile Pro Glu Asn Glu Val Pro Ala Pro Val Lys
805 810 815 

Pro Val Thr Asp Ser Thr Ser Ala Pro Ala Pro Lys Leu Ala Leu Arg 
820 825 830 

Glu Thr Pro Ala Pro Leu Ala Val Thr Ser Ser Glu Pro Ser Thr Thr 
835 840 845 

Pro Asn Asn Trp Ala Asp Phe Ser Ser Thr Trp Pro Thr Ser Thr Asn 
850 855 860 

Glu Lys Pro Glu Thr Asp Asn Trp Asp Ala Trp Ala Ala Gln Pro Ser
865 870 875 880

Leu Thr Val Pro Ser Ala Gly Gln Leu Arg Gln Arg Ser Ala Phe Thr
885 890 895

Pro Ala Thr Ala Thr Gly Ser Ser Pro Ser Pro Val Leu Gly Gln Gly
900 905 910

Glu Lys Val Glu Gly Leu Gln Ala Gln Ala Leu Tyr Pro Trp Arg Ala
915 920 925

Lys Lys Asp Asn His Leu Asn Phe Asn Lys Asn Asp Val Ile Thr Val
930 935 940

Leu Glu Gln Gln Asp Met Trp Trp Phe Gly Glu Val Gln Gly Gln Lys
945 950 955 960

Gly Trp Phe Pro Lys Ser Tyr Val Lys Leu Ile Ser Gly Pro Ile Arg
965 970 975

Lys Ser Thr Ser Met Asp Ser Gly Ser Ser Glu Ser Pro Ala Ser Leu 
980 985 990 

Lys Arg Val Ala Ser Pro Ala Ala Lys Pro Val Val Ser Gly Glu Glu 
995 1000 1005 

Phe Ile Ala Met Tyr Thr Tyr Glu Ser Ser Glu Gln Gly Asp Leu Thr
1010 1015 1020

Phe Gln Gln Gly Asp Val Ile Leu Val Thr Lys Lys Asp Gly Asp Trp
1025 1030 1035 1040 



Trp Thr Gly Thr Val Gly Asp Lys Ala Gly Val Phe Pro Ser Asn Tyr 



1045 1050 1055 

Val Arg Leu Lys Asp Ser Glu Gly Ser Gly Thr Ala Gly Lys Thr Gly 
1060 1065 1070 

Ser Leu Gly Lys Lys Pro Glu Ile Ala Gln Val Ile Ala Ser Tyr Thr
1075 1080 1085

Ala Thr Gly Pro Glu Gln Leu Thr Leu Ala Pro Gly Gln Leu Ile Leu
1090 1095 1100

Ile Arg Lys Lys Asn Pro Gly Gly Trp Trp Glu Gly Glu Leu Gln Ala
1105 1110 1115 1120

Arg Gly Lys Lys Arg Gln Ile Gly Trp Phe Pro Ala Asn Tyr Val Lys
1125 1130 1135

Leu Leu Ser Pro Gly Thr Ser Lys Ile Thr Pro Thr Glu Pro Pro Lys
1140 1145 1150

Ser Thr Ala Leu Ala Ala Val Cys Gln Val Ile Gly Met Tyr Asp Tyr
1155 1160 1165

Thr Ala Gln Asn Asp Asp Glu Leu Ala Phe Asn Lys Gly Gln Ile Ile
1170 1175 1180

Asn Val Leu Asn Lys Glu Asp Pro Asp Trp Trp Lys Gly Glu Val Asn
1185 1190 1195 1200

Gly Gln Val Gly Leu Phe Pro Ser Asn Tyr Val Lys Leu Thr Thr Asp
1205 1210 1215

Met Asp Pro Ser Gln Gln
1220 



<210> 45 
<211> 10 
<212> PRT 

<213> Homo sapiens 
<400> 45 

Ile Ile Cys Cys Pro Ser Pro Pro Gln Ala
1 5 10



<210> 46 
<211> 11 
<212> PRT 

<213> Homo sapiens 
<400> 46 

Lys Ser Phe Cys Gly Phe Pro Ser Tyr Ser Asn 
1 5 10



<210> 47 



<211> 30 
<212> PRT 

<213> Homo sapiens 
<400> 47 

Leu Ser Pro Thr Phe Ala Gln Val Leu Ser Ile Val Leu Lys Leu Phe
1 5 10 15

Leu Asn Ile Tyr Phe Ser Phe Leu Ile Asn Lys Ile Asn Lys
20 25 30



<210> 48 
<211> 20 
<212> PRT 

<213> Homo sapiens 
<400> 48 

Leu Leu Cys Tyr Phe Gly Phe Ala Lys Arg Pro Thr Ile Lys Glu Cys
1 5 10 15

Cys Met Cys Tyr 
20 




<210> 49
<211> 34
<212> PRT

<213> Homo sapiens
<400> 49

Lys Leu Phe Gln Met Ser Ile Asn Leu Arg Leu Asp Val Phe Phe His
1 5 10 15

Phe Val Gln Cys Tyr Gln Leu Asn Cys Ala Val Trp Gly Phe Ser Pro
20 25 30

Leu Pro



<210> 50 
<211> 13 
<212> PRT 

<213> Homo sapiens 
<400> 50 

Lys Cys Arg Gly Val Gln Tyr Leu Cys Phe Lys Asp Val
1 5 10



<210> 51 

<211> 4 

<212> PRT 

<213> Homo sapiens 



<400> 51 



Asn Glu Pro Asn 
1 



<210> 52 
<211> 15 
<212> PRT 

<213> Homo sapiens 
<400> 52 

Ser Glu Gly Val Cys Ala Cys Leu Cys Val Ser Ala Val Pro Cys 
1 5 10 15



<210> 53 
<211> 7 
<212> PRT 

<213> Homo sapiens 
<400> 53 

Ala Cys Asn Thr Ser Cys Thr 
1 5 



<210> 54 
<211> 29 
<212> PRT 

<213> Homo sapiens 
<400> 54 

Glu Ile Ser Ser Phe His Gly Lys Ala Ile Thr Leu Tyr Asp Ala Leu
1 5 10 15

Ile Ile Leu His Leu Ile Leu Phe Cys Thr Val Thr Leu
20 25



<210> 55 
<211> 33 
<212> PRT 

<213> Homo sapiens 
<400> 55 

Pro His Glu Lys Ala Leu Cys Val Phe Val Arg Ser Gln Ile Tyr Leu
1 5 10 15

Val Glu Leu Val Phe Cys Leu Gly Phe Leu Ile Leu Arg Val Cys Ile
20 25 30

Ala 



<210> 56 

<211> 2 

<212> PRT 

<213> Homo sapiens 



<400> 56 
Asn Gln
1 



<210> 57 
<211> 16 
<212> PRT 

<213> Homo sapiens 
<400> 57 

Thr Thr Pro Leu Arg Ser Leu Arg Ser Thr Ile Ser Thr Val Ser Phe
1 5 10 15



<210> 58 
<211> 14 
<212> PRT 

<213> Homo sapiens 
<400> 58 

Ser Leu Leu His Glu Val Leu Phe Gln Leu Leu Phe Met Glu
1 5 10



<210> 59 
<211> 5 
<212> PRT 

<213> Homo sapiens 
<400> 59 

Pro Ile Leu Asn Lys
1 5 



<210> 60 

<211> 2 

<212> PRT 

<213> Homo sapiens 

<400> 60 
Phe Ser 
1 



<210> 61 
<211> 29 
<212> PRT 

<213> Homo sapiens 
<400> 61 

Gln Glu Arg Met Tyr Arg Ser Leu Pro Ala Ile Asn Phe Gln Cys Leu
1 5 10 15

His Phe Leu Thr Arg Leu Trp Asn Phe Tyr Arg Leu Ile
20 25 



<210> 62 
<211> 9 
<212> PRT 

<213> Homo sapiens 
<400> 62 

Asn Gly Ala His Gly Pro Phe Val Cys 
1 5 



<210> 63 

<211> 4 

<212> PRT 

<213> Homo sapiens 

<400> 63 
Ile Cys Cys Ser
1 



<210> 64
<211> 33
<212> PRT

<213> Homo sapiens
<400> 64

Ser Pro Val Cys Leu Leu Asn Thr Ser Trp Lys Leu Ser Ile Lys Met
1 5 10 15

Pro Ala Ala His Ser Thr Glu Asn Gly Ala Gly Gly Ala Ser Ser Thr
20 25 30

Ile



<210> 65 
<211> 3 
<212> PRT 

<213> Homo sapiens 



<400> 65 
Leu Ser Ser 
1 



<210> 66 

<211> 50 

<212> PRT 

<213> Homo sapiens

<400> 66

Arg Leu Cys Asn Ala His Ser Pro Arg Val Leu Pro Ala Leu Ser Gly
1 5 10 15

Gly Cys Ala Gly Gly Arg Val Glu Val Leu Leu Leu Ser His Gly Ala
20 25 30

Glu Ser Glu Asp Leu Ser Ser Ser Phe Ser Cys Thr Ser Val Phe Ser
35 40 45

Arg Ile
50



<210> 67 
<211> 1 
<212> PRT 

<213> Homo sapiens 

<400> 67 
Met 
1 



<210> 68 
<211> 2 
<212> PRT 

<213> Homo sapiens 

<400> 68 
Asn Ile
1 



<210> 69 
<211> 22 
<212> PRT 

<213> Homo sapiens 
<400> 69 

Ile Tyr Lys Pro Ala Ala Leu Thr Thr Val Ile Gln Pro Phe Glu Leu
1 5 10 15

Val Pro Cys Ile Asp Asn
20 



<210> 70 
<211> 13 
<212> PRT 

<213> Homo sapiens 
<400> 70 

Ile Leu His Thr Lys Val Lys Lys Lys Lys Lys Lys Lys
1 5 10



<210> 71 

<211> 2079 

<212> DNA 

<213> Homo sapiens 



<400> 71 

cggggatggt gtgcggggct gcggctcctg cgtccctccc agcggcgcgt gagcggcact 60
gatttgtccc tggggcggca gcgcggaccc gcccggagat gaggcgtcga ttagcaaggt 120
aaaagtaaca gaaccatggc tcagtttcca acaccttttg gtggcagcct ggatatctgg 180
gccataactg tagaggaaag agcgaagcat gatcagcagt tccatagttt aaagccaata 240
tctggattca ttactggtga tcaagctaga aacttttttt ttcaatctgg gttacctcaa 300
cctgttttag cacagatatg ggcactagct gacatgaata atgatggaag aatggatcaa 360
gtggagtttt ccatagctat gaaacttatc aaactgaagc tacaaggata tcagctaccc 420
tctgcacttc cccctgtcat gaaacagcaa ccagttgcta tttctagcgc accagcattt 480
ggtatgggag gtatcgccag catgccaccg cttacagctg ttgctccagt gccaatggga 540
tccattccag ttgttggaat gtctccaacc ctagtatctt ctgttcccac agcagctgtg 600
ccccccctgg ctaacggggc tccccctgtt atacaacctc tgcctgcatt tgctcatcct 660
gcagccacat tgccaaagag ttcttccttt agtagatctg gtccagggtc acaactaaac 720
actaaattac aaaaggcaca gtcatttgat gtggccagtg tcccaccagt ggcagagtgg 780
gctgttcctc agtcatcaag actgaaatac aggcaattat tcaatagtca tgacaaaact 840
atgagtggac acttaacagg tccccaagca agaactattc ttatgcagtc aagtttacca 900
caggctcagc tggcttcaat atggaatctt tctgacattg atcaagatgg aaaacttaca 960
gcagaggaat ttatcctggc aatgcacctc attgatgtag ctatgtctgg ccaaccactg 1020
ccacctgtcc tgcctccaga atacattcca ccttctttta gaagagttcg atctggcagt 1080
ggtatatctg tcataagctc aacatctgta gatcagaggc taccagagga accagtttta 1140
gaagatgaac aacaacaatt agaaaagaaa ttacctgtaa cgtttgaaga taagaagcgg 1200
gagaactttg aacgtggcaa cctggaactg gagaaacgaa ggcaagctct cctggaacag 1260
cagcgcaagg agcaggagcg cctggcccag ctggagcggg cggagcagga gaggaaggag 1320
cgtgagcgcc aggagcaaga gcgcaaaaga caactggaac tggagaagca actggaaaag 1380
cagcgggagc tagaacggca gagagaggag gagaggagga aagaaattga gaggcgagag 1440
gctgcaaaac gggaacttga aaggcaacga caacttgagt gggaacggaa tcgaaggcaa 1500
gaactactaa atcaaagaaa caaagaacaa gaggacatag ttgtactgaa agcaaagaaa 1560
aagactttgg aatttgaatt agaagctcta aatgataaaa agcatcaact agaagggaaa 1620
cttcaagata tcagatgtcg attgaccacc caaaggcaag aaattgagag cacaaacaaa 1680
tctagagagt tgagaattgc cgaaatcacc catctacagc aacaattaca ggaatctcag 1740
caaatgcttg gaagacttat tccagaaaaa cagatactca atgaccaatt aaaacaagtt 1800
cagcagaaca gtttgcacag agattcactt gttacactta aaagagcctt agaagcaaaa 1860
gaactagctc ggcagcacct acgagaccaa ctggatgaag tggagaaaga aactagatca 1920
aaactacagg agattgatat tttcaataat cagctgaagg aactaagaga aatacacaat 1980
aagcaacaac tccagaagca aaagtccatg gaggctgaac gactgaaaca gaaagaacaa 2040
gaacgaaaga tcatagaatt agaaaaaaaa aaaaaaaaa 2079



<210> 72 
<211> 648 
<212> PRT 

<213> Homo sapiens 



<400> 72 

Met Ala Gln Phe Pro Thr Pro Phe Gly Gly Ser Leu Asp Ile Trp Ala
1 5 10 15

Ile Thr Val Glu Glu Arg Ala Lys His Asp Gln Gln Phe His Ser Leu
20 25 30

Lys Pro Ile Ser Gly Phe Ile Thr Gly Asp Gln Ala Arg Asn Phe Phe
35 40 45

Phe Gln Ser Gly Leu Pro Gln Pro Val Leu Ala Gln Ile Trp Ala Leu
50 55 60

Ala Asp Met Asn Asn Asp Gly Arg Met Asp Gln Val Glu Phe Ser Ile
65 70 75 80



Ala Met Lys Leu Ile Lys Leu Lys Leu Gln Gly Tyr Gln Leu Pro Ser
85 90 95

Ala Leu Pro Pro Val Met Lys Gln Gln Pro Val Ala Ile Ser Ser Ala
100 105 110

Pro Ala Phe Gly Met Gly Gly Ile Ala Ser Met Pro Pro Leu Thr Ala
115 120 125

Val Ala Pro Val Pro Met Gly Ser Ile Pro Val Val Gly Met Ser Pro
130 135 140

Thr Leu Val Ser Ser Val Pro Thr Ala Ala Val Pro Pro Leu Ala Asn
145 150 155 160

Gly Ala Pro Pro Val Ile Gln Pro Leu Pro Ala Phe Ala His Pro Ala
165 170 175

Ala Thr Leu Pro Lys Ser Ser Ser Phe Ser Arg Ser Gly Pro Gly Ser
180 185 190

Gln Leu Asn Thr Lys Leu Gln Lys Ala Gln Ser Phe Asp Val Ala Ser
195 200 205

Val Pro Pro Val Ala Glu Trp Ala Val Pro Gln Ser Ser Arg Leu Lys
210 215 220

Tyr Arg Gln Leu Phe Asn Ser His Asp Lys Thr Met Ser Gly His Leu
225 230 235 240

Thr Gly Pro Gln Ala Arg Thr Ile Leu Met Gln Ser Ser Leu Pro Gln
245 250 255

Ala Gln Leu Ala Ser Ile Trp Asn Leu Ser Asp Ile Asp Gln Asp Gly
260 265 270

Lys Leu Thr Ala Glu Glu Phe Ile Leu Ala Met His Leu Ile Asp Val
275 280 285

Ala Met Ser Gly Gln Pro Leu Pro Pro Val Leu Pro Pro Glu Tyr Ile
290 295 300

Pro Pro Ser Phe Arg Arg Val Arg Ser Gly Ser Gly Ile Ser Val Ile
305 310 315 320

Ser Ser Thr Ser Val Asp Gln Arg Leu Pro Glu Glu Pro Val Leu Glu
325 330 335

Asp Glu Gln Gln Gln Leu Glu Lys Lys Leu Pro Val Thr Phe Glu Asp
340 345 350 

Lys Lys Arg Glu Asn Phe Glu Arg Gly Asn Leu Glu Leu Glu Lys Arg 
355 360 365 



Arg Gln Ala Leu Leu Glu Gln Gln Arg Lys Glu Gln Glu Arg Leu Ala
370 375 380

Gln Leu Glu Arg Ala Glu Gln Glu Arg Lys Glu Arg Glu Arg Gln Glu
385 390 395 400

Gln Glu Arg Lys Arg Gln Leu Glu Leu Glu Lys Gln Leu Glu Lys Gln
405 410 415

Arg Glu Leu Glu Arg Gln Arg Glu Glu Glu Arg Arg Lys Glu Ile Glu
420 425 430

Arg Arg Glu Ala Ala Lys Arg Glu Leu Glu Arg Gln Arg Gln Leu Glu
435 440 445

Trp Glu Arg Asn Arg Arg Gln Glu Leu Leu Asn Gln Arg Asn Lys Glu
450 455 460

Gln Glu Asp Ile Val Val Leu Lys Ala Lys Lys Lys Thr Leu Glu Phe
465 470 475 480

Glu Leu Glu Ala Leu Asn Asp Lys Lys His Gln Leu Glu Gly Lys Leu
485 490 495

Gln Asp Ile Arg Cys Arg Leu Thr Thr Gln Arg Gln Glu Ile Glu Ser
500 505 510

Thr Asn Lys Ser Arg Glu Leu Arg Ile Ala Glu Ile Thr His Leu Gln
515 520 525

Gln Gln Leu Gln Glu Ser Gln Gln Met Leu Gly Arg Leu Ile Pro Glu
530 535 540

Lys Gln Ile Leu Asn Asp Gln Leu Lys Gln Val Gln Gln Asn Ser Leu
545 550 555 560 

His Arg Asp Ser Leu Val Thr Leu Lys Arg Ala Leu Glu Ala Lys Glu 
565 570 575 

Leu Ala Arg Gln His Leu Arg Asp Gln Leu Asp Glu Val Glu Lys Glu
580 585 590

Thr Arg Ser Lys Leu Gln Glu Ile Asp Ile Phe Asn Asn Gln Leu Lys
595 600 605

Glu Leu Arg Glu Ile His Asn Lys Gln Gln Leu Gln Lys Gln Lys Ser
610 615 620

Met Glu Ala Glu Arg Leu Lys Gln Lys Glu Gln Glu Arg Lys Ile Ile
625 630 635 640 

Glu Leu Glu Lys Lys Lys Lys Lys 
645 



<210> 73 
<211> 33 
<212> PRT 



<213> Homo sapiens 



<220> 

<223> From Seq ID 73 to ID 75, there are 3 protein
sequences translated from Seq ID No. 71. Together,
they form the whole protein sequence.

<400> 73 

Arg Gly Trp Cys Ala Gly Leu Arg Leu Leu Arg Pro Ser Gln Arg Arg
1 5 10 15

Val Ser Gly Thr Asp Leu Ser Leu Gly Arg Gln Arg Gly Pro Ala Arg
20 25 30

Arg



<210> 74 
<211> 3 
<212> PRT 

<213> Homo sapiens 

<400> 74 
Gly Val Asp 
1 



<210> 75 
<211> 655 
<212> PRT 

<213> Homo sapiens 
<400> 75 

Gln Gly Lys Ser Asn Arg Thr Met Ala Gln Phe Pro Thr Pro Phe Gly
1 5 10 15

Gly Ser Leu Asp Ile Trp Ala Ile Thr Val Glu Glu Arg Ala Lys His
20 25 30

Asp Gln Gln Phe His Ser Leu Lys Pro Ile Ser Gly Phe Ile Thr Gly
35 40 45

Asp Gln Ala Arg Asn Phe Phe Phe Gln Ser Gly Leu Pro Gln Pro Val
50 55 60

Leu Ala Gln Ile Trp Ala Leu Ala Asp Met Asn Asn Asp Gly Arg Met
65 70 75 80

Asp Gln Val Glu Phe Ser Ile Ala Met Lys Leu Ile Lys Leu Lys Leu
85 90 95

Gln Gly Tyr Gln Leu Pro Ser Ala Leu Pro Pro Val Met Lys Gln Gln
100 105 110

Pro Val Ala Ile Ser Ser Ala Pro Ala Phe Gly Met Gly Gly Ile Ala
115 120 125

Ser Met Pro Pro Leu Thr Ala Val Ala Pro Val Pro Met Gly Ser Ile
130 135 140 

Pro Val Val Gly Met Ser Pro Thr Leu Val Ser Ser Val Pro Thr Ala 
145 150 155 160 

Ala Val Pro Pro Leu Ala Asn Gly Ala Pro Pro Val Ile Gln Pro Leu
165 170 175 

Pro Ala Phe Ala His Pro Ala Ala Thr Leu Pro Lys Ser Ser Ser Phe 
180 185 190 

Ser Arg Ser Gly Pro Gly Ser Gln Leu Asn Thr Lys Leu Gln Lys Ala
195 200 205

Gln Ser Phe Asp Val Ala Ser Val Pro Pro Val Ala Glu Trp Ala Val
210 215 220

Pro Gln Ser Ser Arg Leu Lys Tyr Arg Gln Leu Phe Asn Ser His Asp
225 230 235 240

Lys Thr Met Ser Gly His Leu Thr Gly Pro Gln Ala Arg Thr Ile Leu
245 250 255

Met Gln Ser Ser Leu Pro Gln Ala Gln Leu Ala Ser Ile Trp Asn Leu
260 265 270

Ser Asp Ile Asp Gln Asp Gly Lys Leu Thr Ala Glu Glu Phe Ile Leu
275 280 285

Ala Met His Leu Ile Asp Val Ala Met Ser Gly Gln Pro Leu Pro Pro
290 295 300

Val Leu Pro Pro Glu Tyr Ile Pro Pro Ser Phe Arg Arg Val Arg Ser
305 310 315 320

Gly Ser Gly Ile Ser Val Ile Ser Ser Thr Ser Val Asp Gln Arg Leu
325 330 335

Pro Glu Glu Pro Val Leu Glu Asp Glu Gln Gln Gln Leu Glu Lys Lys
340 345 350 

Leu Pro Val Thr Phe Glu Asp Lys Lys Arg Glu Asn Phe Glu Arg Gly 
355 360 365 

Asn Leu Glu Leu Glu Lys Arg Arg Gln Ala Leu Leu Glu Gln Gln Arg
370 375 380

Lys Glu Gln Glu Arg Leu Ala Gln Leu Glu Arg Ala Glu Gln Glu Arg
385 390 395 400

Lys Glu Arg Glu Arg Gln Glu Gln Glu Arg Lys Arg Gln Leu Glu Leu
405 410 415

Glu Lys Gln Leu Glu Lys Gln Arg Glu Leu Glu Arg Gln Arg Glu Glu
420 425 430

Glu Arg Arg Lys Glu Ile Glu Arg Arg Glu Ala Ala Lys Arg Glu Leu
435 440 445

Glu Arg Gln Arg Gln Leu Glu Trp Glu Arg Asn Arg Arg Gln Glu Leu
450 455 460

Leu Asn Gln Arg Asn Lys Glu Gln Glu Asp Ile Val Val Leu Lys Ala
465 470 475 480 

Lys Lys Lys Thr Leu Glu Phe Glu Leu Glu Ala Leu Asn Asp Lys Lys 
485 490 495 

His Gln Leu Glu Gly Lys Leu Gln Asp Ile Arg Cys Arg Leu Thr Thr
500 505 510

Gln Arg Gln Glu Ile Glu Ser Thr Asn Lys Ser Arg Glu Leu Arg Ile
515 520 525

Ala Glu Ile Thr His Leu Gln Gln Gln Leu Gln Glu Ser Gln Gln Met
530 535 540

Leu Gly Arg Leu Ile Pro Glu Lys Gln Ile Leu Asn Asp Gln Leu Lys
545 550 555 560

Gln Val Gln Gln Asn Ser Leu His Arg Asp Ser Leu Val Thr Leu Lys
565 570 575

Arg Ala Leu Glu Ala Lys Glu Leu Ala Arg Gln His Leu Arg Asp Gln
580 585 590

Leu Asp Glu Val Glu Lys Glu Thr Arg Ser Lys Leu Gln Glu Ile Asp
595 600 605

Ile Phe Asn Asn Gln Leu Lys Glu Leu Arg Glu Ile His Asn Lys Gln
610 615 620

Gln Leu Gln Lys Gln Lys Ser Met Glu Ala Glu Arg Leu Lys Gln Lys
625 630 635 640

Glu Gln Glu Arg Lys Ile Ile Glu Leu Glu Lys Lys Lys Lys Lys
645 650 655 



<210> 76 

<211> 3231 

<212> DNA 

<213> Homo sapiens 

<400> 76 

gaccacccaa aggcaagaaa ttgagagcac aaacaaatct agagagttga gaattgccga 60
aatcacccat ctacagcaac aattacagga atctcagcaa atgcttggaa gacttattcc 120
agaaaaacag atactcaatg accaattaaa acaagttcag cagaacagtt tgcacagaga 180
ttcacttgtt acacttaaaa gagccttaga agcaaaagaa ctagctcggc agcacctacg 240
agaccaactg gatgaagtgg agaaagaaac tagatcaaaa ctacaggaga ttgatatttt 300
caataatcag ctgaaggaac taagagaaat acacaataag caacaactcc agaagcaaaa 360
gtccatggag gctgaacgac tgaaacagaa agaacaagaa cgaaagatca tagaattaga 420
aaaacaaaaa gaagaagccc aaagacgagc tcaggaaagg gacaagcagt ggctggagca 480
tgtgcagcag gaggacgagc atcagagacc aagaaaactc cacgaagagg aaaaactgaa 540
aagggaggag agtgtcaaaa agaaggatgg cgaggaaaaa ggcaaacagg aagcacaaga 600
caagctgggt cggcttttcc atcaacacca agaaccagct aagccagctg tccaggcacc 660
ctggtccact gcagaaaaag gtccacttac catttctgca caggaaaatg taaaagtggt 720
gtattaccgg gcactgtacc cctttgaatc cagaagccat gatgaaatca ctatccagcc 780
aggagacata gtcatggtgg atgaaagcca aactggagaa cccggctggc ttggaggaga 840
attaaaagga aagacagggt ggttccctgc aaactatgca gagaaaatcc cagaaaatga 900
ggttcccgct ccagtgaaac cagtgactga ttcaacatct gcccctgccc ccaaactggc 960
cttgcgtgag acccccgccc ctttggcagt aacctcttca gagccctcca cgacccctaa 1020
taactgggcc gacttcagct ccacgtggcc caccagcacg aatgagaaac cagaaacgga 1080
taactgggat gcatgggcag cccagccctc tctcaccgtt ccaagtgccg gccagttaag 1140
gcagaggtcc gcctttactc cagccacggc cactggctcc tccccgtctc ctgtgctagg 1200
ccagggtgaa aaggtggagg ggctacaagc tcaagcccta tatccttgga gagccaaaaa 1260
agacaaccac ttaaatttta acaaaaatga tgtcatcacc gtcctggaac agcaagacat 1320
gtggtggttt ggagaagttc aaggtcagaa gggttggttc cccaagtctt acgtgaaact 1380
catttcaggg cccataagga agtctacaag catggattct ggttcttcag agagtcctgc 1440
tagtctaaag cgagtagcct ctccagcagc caagccggtc gtttcgggag aagaaattgc 1500
ccaggttatt gcctcataca ccgccaccgg ccccgagcag ctcactctcg cccctggtca 1560
gctgattttg atccgaaaaa agaacccagg tggatggtgg gaaggagagc tgcaagcacg 1620
tgggaaaaag cgccagatag gctggttccc agctaattat gtaaagcttc taagccctgg 1680
gacgagcaaa atcactccaa cagagccacc taagtcaaca gcattagcgg cagtgtgcca 1740
ggtgattggg atgtacgact acaccgcgca gaatgacgat gagctggcct tcaacaaggg 1800
ccagatcatc aacgtcctca acaaggagga ccctgactgg tggaaaggag aagtcaatgg 1860
acaagtgggg ctcttcccat ccaattatgt gaagctgacc acagacatgg acccaagcca 1920
gcaatgaatc atatgttgtc catccccccc tcaggcttga aagtcctttt gtggctttcc 1980
tagttactca aattgacttt cccccacctt tgcacaggtg ctttcaatag ttttaaaatt 2040
atttttaaat atatatttta gctttttaat aaacaaaata aataaatgac ttctttgcta 2100
ttttggtttt gcaaaaagac ccactatcaa ggaatgctgc atgtgctatt aaaaattgtt 2160
ccaaatgtcc ataaatctga gacttgatgt attttttcat tttgtccagt gttaccaact 2220
aaattgtgca gtttggggct tttccccctt accatagaag tgcagaggag ttcagtatct 2280
ctgttttaaa gacgtataga atgagcccaa ttaaagcgaa ggtgtttgtg cttgtttgtg 2340
tgtatcagct gtaccttgtt gagcatgtaa tacatcctgt acataagaaa ttagttcttt 2400
ccatggcaaa gctattacct tgtacgatgc tctaatcata ttgcatttaa ttttattttg 2460
cacagtgacc ttgtagccac atgagaaagc actctgtgtt tttgttcggt ctcagattta 2520
tctggttgag ttggtgtttt gtttggggtt tttaattttg cgtgtttgca tagcataaaa 2580
tcagtagaca acaccactga ggtcgttacg atcaacgata tccacagtct ctttttagtc 2640
tctgttacat gaagttttat tccagttact tttcatggaa tgacctattt tgaacaagta 2700
attttcttga caagaaagaa tgtatagaag tctccctgca attaatttcc aatgtttaca 2760
ttttttaact agactgtgga atttctacag attaatatga aatggagctc atggtccgtt 2820
tgtgtgttag atatgctgta gctgaagccc tgtttgtctt ttaaacacta gttggaagct 2880
ctcaataaaa atgcctgctg ctcacagcac agaaaatggg gcagggggag cctcaagcac 2940
aatctagctg tcctcctaaa gactctgtaa tgctcactcc cctcgcgttc tcccggcgct 3000
gtcgggaggc tgtgctggtg gtcgtgtaag gtccttctcc tttcacatgg tgcagagagc 3060
gaggacctct cctcctcgtt cagttgcact tcagtatttt cacggatatg aatgtaaaat 3120
atataaatat ataaacctgc ggctttaaca actgtaatac aaccttttga attagttccg 3180
tgtatagata attaaattct tcatacaaaa gttaaaaaaa aaaaaaaaaa a 3231



<210> 77
<211> 641
<212> PRT

<213> Homo sapiens



<400> 77 

Thr Thr Gln Arg Gln Glu Ile Glu Ser Thr Asn Lys Ser Arg Glu Leu
1 5 10 15

Arg Ile Ala Glu Ile Thr His Leu Gln Gln Gln Leu Gln Glu Ser Gln
20 25 30

Gln Met Leu Gly Arg Leu Ile Pro Glu Lys Gln Ile Leu Asn Asp Gln
35 40 45

Leu Lys Gln Val Gln Gln Asn Ser Leu His Arg Asp Ser Leu Val Thr
50 55 60

Leu Lys Arg Ala Leu Glu Ala Lys Glu Leu Ala Arg Gln His Leu Arg
65 70 75 80

Asp Gln Leu Asp Glu Val Glu Lys Glu Thr Arg Ser Lys Leu Gln Glu
85 90 95

Ile Asp Ile Phe Asn Asn Gln Leu Lys Glu Leu Arg Glu Ile His Asn
100 105 110

Lys Gln Gln Leu Gln Lys Gln Lys Ser Met Glu Ala Glu Arg Leu Lys
115 120 125

Gln Lys Glu Gln Glu Arg Lys Ile Ile Glu Leu Glu Lys Gln Lys Glu
130 135 140

Glu Ala Gln Arg Arg Ala Gln Glu Arg Asp Lys Gln Trp Leu Glu His
145 150 155 160

Val Gln Gln Glu Asp Glu His Gln Arg Pro Arg Lys Leu His Glu Glu
165 170 175

Glu Lys Leu Lys Arg Glu Glu Ser Val Lys Lys Lys Asp Gly Glu Glu
180 185 190

Lys Gly Lys Gln Glu Ala Gln Asp Lys Leu Gly Arg Leu Phe His Gln
195 200 205

His Gln Glu Pro Ala Lys Pro Ala Val Gln Ala Pro Trp Ser Thr Ala
210 215 220

Glu Lys Gly Pro Leu Thr Ile Ser Ala Gln Glu Asn Val Lys Val Val
225 230 235 240

Tyr Tyr Arg Ala Leu Tyr Pro Phe Glu Ser Arg Ser His Asp Glu Ile
245 250 255

Thr Ile Gln Pro Gly Asp Ile Val Met Val Asp Glu Ser Gln Thr Gly
260 265 270

Glu Pro Gly Trp Leu Gly Gly Glu Leu Lys Gly Lys Thr Gly Trp Phe
275 280 285

Pro Ala Asn Tyr Ala Glu Lys Ile Pro Glu Asn Glu Val Pro Ala Pro
290 295 300

Val Lys Pro Val Thr Asp Ser Thr Ser Ala Pro Ala Pro Lys Leu Ala
305 310 315 320

Leu Arg Glu Thr Pro Ala Pro Leu Ala Val Thr Ser Ser Glu Pro Ser
325 330 335

Thr Thr Pro Asn Asn Trp Ala Asp Phe Ser Ser Thr Trp Pro Thr Ser
340 345 350

Thr Asn Glu Lys Pro Glu Thr Asp Asn Trp Asp Ala Trp Ala Ala Gln
355 360 365

Pro Ser Leu Thr Val Pro Ser Ala Gly Gln Leu Arg Gln Arg Ser Ala
370 375 380

Phe Thr Pro Ala Thr Ala Thr Gly Ser Ser Pro Ser Pro Val Leu Gly
385 390 395 400

Gln Gly Glu Lys Val Glu Gly Leu Gln Ala Gln Ala Leu Tyr Pro Trp
405 410 415

Arg Ala Lys Lys Asp Asn His Leu Asn Phe Asn Lys Asn Asp Val Ile
420 425 430

Thr Val Leu Glu Gln Gln Asp Met Trp Trp Phe Gly Glu Val Gln Gly
435 440 445

Gln Lys Gly Trp Phe Pro Lys Ser Tyr Val Lys Leu Ile Ser Gly Pro
450 455 460

Ile Arg Lys Ser Thr Ser Met Asp Ser Gly Ser Ser Glu Ser Pro Ala
465 470 475 480

Ser Leu Lys Arg Val Ala Ser Pro Ala Ala Lys Pro Val Val Ser Gly 
485 490 495 

Glu Glu Ile Ala Gln Val Ile Ala Ser Tyr Thr Ala Thr Gly Pro Glu
500 505 510

Gln Leu Thr Leu Ala Pro Gly Gln Leu Ile Leu Ile Arg Lys Lys Asn
515 520 525

Pro Gly Gly Trp Trp Glu Gly Glu Leu Gln Ala Arg Gly Lys Lys Arg
530 535 540

Gln Ile Gly Trp Phe Pro Ala Asn Tyr Val Lys Leu Leu Ser Pro Gly
545 550 555 560

Thr Ser Lys Ile Thr Pro Thr Glu Pro Pro Lys Ser Thr Ala Leu Ala
565 570 575

Ala Val Cys Gln Val Ile Gly Met Tyr Asp Tyr Thr Ala Gln Asn Asp
580 585 590

Asp Glu Leu Ala Phe Asn Lys Gly Gln Ile Ile Asn Val Leu Asn Lys
595 600 605

Glu Asp Pro Asp Trp Trp Lys Gly Glu Val Asn Gly Gln Val Gly Leu
610 615 620

Phe Pro Ser Asn Tyr Val Lys Leu Thr Thr Asp Met Asp Pro Ser Gln
625 630 635 640

Gln



<210> 78 
<211> 641 
<212> PRT 

<213> Homo sapiens 
<400> 78 

Thr Thr Gln Arg Gln Glu Ile Glu Ser Thr Asn Lys Ser Arg Glu Leu
1 5 10 15

Arg Ile Ala Glu Ile Thr His Leu Gln Gln Gln Leu Gln Glu Ser Gln
20 25 30

Gln Met Leu Gly Arg Leu Ile Pro Glu Lys Gln Ile Leu Asn Asp Gln
35 40 45

Leu Lys Gln Val Gln Gln Asn Ser Leu His Arg Asp Ser Leu Val Thr
50 55 60

Leu Lys Arg Ala Leu Glu Ala Lys Glu Leu Ala Arg Gln His Leu Arg
65 70 75 80

Asp Gln Leu Asp Glu Val Glu Lys Glu Thr Arg Ser Lys Leu Gln Glu
85 90 95

Ile Asp Ile Phe Asn Asn Gln Leu Lys Glu Leu Arg Glu Ile His Asn
100 105 110

Lys Gln Gln Leu Gln Lys Gln Lys Ser Met Glu Ala Glu Arg Leu Lys
115 120 125

Gln Lys Glu Gln Glu Arg Lys Ile Ile Glu Leu Glu Lys Gln Lys Glu
130 135 140

Glu Ala Gln Arg Arg Ala Gln Glu Arg Asp Lys Gln Trp Leu Glu His
145 150 155 160

Val Gln Gln Glu Asp Glu His Gln Arg Pro Arg Lys Leu His Glu Glu
165 170 175

Glu Lys Leu Lys Arg Glu Glu Ser Val Lys Lys Lys Asp Gly Glu Glu
180 185 190

Lys Gly Lys Gln Glu Ala Gln Asp Lys Leu Gly Arg Leu Phe His Gln
195 200 205

His Gln Glu Pro Ala Lys Pro Ala Val Gln Ala Pro Trp Ser Thr Ala
210 215 220

Glu Lys Gly Pro Leu Thr Ile Ser Ala Gln Glu Asn Val Lys Val Val
225 230 235 240

Tyr Tyr Arg Ala Leu Tyr Pro Phe Glu Ser Arg Ser His Asp Glu Ile
245 250 255

Thr Ile Gln Pro Gly Asp Ile Val Met Val Asp Glu Ser Gln Thr Gly
260 265 270

Glu Pro Gly Trp Leu Gly Gly Glu Leu Lys Gly Lys Thr Gly Trp Phe 
275 280 285 

Pro Ala Asn Tyr Ala Glu Lys Ile Pro Glu Asn Glu Val Pro Ala Pro
290 295 300

Val Lys Pro Val Thr Asp Ser Thr Ser Ala Pro Ala Pro Lys Leu Ala
305 310 315 320

Leu Arg Glu Thr Pro Ala Pro Leu Ala Val Thr Ser Ser Glu Pro Ser
325 330 335

Thr Thr Pro Asn Asn Trp Ala Asp Phe Ser Ser Thr Trp Pro Thr Ser
340 345 350

Thr Asn Glu Lys Pro Glu Thr Asp Asn Trp Asp Ala Trp Ala Ala Gln
355 360 365

Pro Ser Leu Thr Val Pro Ser Ala Gly Gln Leu Arg Gln Arg Ser Ala
370 375 380

Phe Thr Pro Ala Thr Ala Thr Gly Ser Ser Pro Ser Pro Val Leu Gly
385 390 395 400

Gln Gly Glu Lys Val Glu Gly Leu Gln Ala Gln Ala Leu Tyr Pro Trp
405 410 415

Arg Ala Lys Lys Asp Asn His Leu Asn Phe Asn Lys Asn Asp Val Ile
420 425 430

Thr Val Leu Glu Gln Gln Asp Met Trp Trp Phe Gly Glu Val Gln Gly
435 440 445

Gln Lys Gly Trp Phe Pro Lys Ser Tyr Val Lys Leu Ile Ser Gly Pro
450 455 460

Ile Arg Lys Ser Thr Ser Met Asp Ser Gly Ser Ser Glu Ser Pro Ala
465 470 475 480

Ser Leu Lys Arg Val Ala Ser Pro Ala Ala Lys Pro Val Val Ser Gly
485 490 495

Glu Glu Ile Ala Gln Val Ile Ala Ser Tyr Thr Ala Thr Gly Pro Glu
500 505 510

Gln Leu Thr Leu Ala Pro Gly Gln Leu Ile Leu Ile Arg Lys Lys Asn
515 520 525

Pro Gly Gly Trp Trp Glu Gly Glu Leu Gln Ala Arg Gly Lys Lys Arg
530 535 540



Gln Ile Gly Trp Phe Pro Ala Asn Tyr Val Lys Leu Leu Ser Pro Gly
545 550 555 560

Thr Ser Lys Ile Thr Pro Thr Glu Pro Pro Lys Ser Thr Ala Leu Ala
565 570 575

Ala Val Cys Gln Val Ile Gly Met Tyr Asp Tyr Thr Ala Gln Asn Asp
580 585 590

Asp Glu Leu Ala Phe Asn Lys Gly Gln Ile Ile Asn Val Leu Asn Lys
595 600 605

Glu Asp Pro Asp Trp Trp Lys Gly Glu Val Asn Gly Gln Val Gly Leu
610 615 620

Phe Pro Ser Asn Tyr Val Lys Leu Thr Thr Asp Met Asp Pro Ser Gln
625 630 635 640

Gln

<400> 80

Lys Ser Phe Cys Gly Phe Pro Ser Tyr Ser Asn 
1 5 10



<210> 81 
<211> 30 
<212> PRT 

<213> Homo sapiens 
<400> 81 

Leu Ser Pro Thr Phe Ala Gln Val Leu Ser Ile Val Leu Lys Leu Phe
1 5 10 15

Leu Asn Ile Tyr Phe Ser Phe Leu Ile Asn Lys Ile Asn Lys
20 25 30



<210> 82 



<211> 20 
<212> PRT 

<213> Homo sapiens 
<400> 82 

Leu Leu Cys Tyr Phe Gly Phe Ala Lys Arg Pro Thr Ile Lys Glu Cys
1 5 10 15

Cys Met Cys Tyr 
20 



<210> 83 
<211> 34 
<212> PRT 

<213> Homo sapiens 
<400> 83 

Lys Leu Phe Gln Met Ser Ile Asn Leu Arg Leu Asp Val Phe Phe His
1 5 10 15

Phe Val Gln Cys Tyr Gln Leu Asn Cys Ala Val Trp Gly Phe Ser Pro
20 25 30

Leu Pro



<210> 84
<211> 13
<212> PRT

<213> Homo sapiens
<400> 84

Lys Cys Arg Gly Val Gln Tyr Leu Cys Phe Lys Asp Val
1 5 10


<210> 85 

<211> 4 

<212> PRT 

<213> Homo sapiens 

<400> 85 
Asn Glu Pro Asn 
1 



<210> 86 
<211> 15 
<212> PRT 
<213> Homo sapiens
<400> 86

Ser Glu Gly Val Cys Ala Cys Leu Cys Val Ser Ala Val Pro Cys
1 5 10 15



<210> 87 
<211> 7 
<212> PRT 

<213> Homo sapiens 
<400> 87 

Ala Cys Asn Thr Ser Cys Thr 
1 5 



<210> 88 
<211> 29 
<212> PRT 

<213> Homo sapiens 
<400> 88 

Glu Ile Ser Ser Phe His Gly Lys Ala Ile Thr Leu Tyr Asp Ala Leu
1 5 10 15

Ile Ile Leu His Leu Ile Leu Phe Cys Thr Val Thr Leu
20 25



<210> 89 
<211> 33 
<212> PRT 

<213> Homo sapiens 
<400> 89 

Pro His Glu Lys Ala Leu Cys Val Phe Val Arg Ser Gln Ile Tyr Leu
1 5 10 15

Val Glu Leu Val Phe Cys Leu Gly Phe Leu Ile Leu Arg Val Cys Ile
20 25 30

Ala 



<210> 90 
<211> 2 
<212> PRT 

<213> Homo sapiens 

<400> 90 
Asn Gln
1 



<210> 91 
<211> 16 
<212> PRT 

<213> Homo sapiens 



<400> 91 

Thr Thr Pro Leu Arg Ser Leu Arg Ser Thr Ile Ser Thr Val Ser Phe
1 5 10 15



<210> 92 
<211> 14 
<212> PRT 

<213> Homo sapiens 
<400> 92 

Ser Leu Leu His Glu Val Leu Phe Gln Leu Leu Phe Met Glu
1 5 10



<210> 93 
<211> 5 
<212> PRT 

<213> Homo sapiens 
<400> 93 

Pro Ile Leu Asn Lys
1 5 



<210> 94
<211> 2
<212> PRT

<213> Homo sapiens

<400> 94
Phe Ser
1



<210> 95
<211> 29
<212> PRT

<213> Homo sapiens
<400> 95

Gln Glu Arg Met Tyr Arg Ser Leu Pro Ala Ile Asn Phe Gln Cys Leu
1 5 10 15

His Phe Leu Thr Arg Leu Trp Asn Phe Tyr Arg Leu Ile
20 25



<210> 96 
<211> 9 
<212> PRT 

<213> Homo sapiens 
<400> 96 

Asn Gly Ala His Gly Pro Phe Val Cys 
1 5 



<210> 97 



<211> 4 
<212> PRT 

<213> Homo sapiens 

<400> 97 
Ile Cys Cys Ser
1 



<210> 98 
<211> 33 
<212> PRT 

<213> Homo sapiens 
<400> 98 

Ser Pro Val Cys Leu Leu Asn Thr Ser Trp Lys Leu Ser Ile Lys Met
1 5 10 15

Pro Ala Ala His Ser Thr Glu Asn Gly Ala Gly Gly Ala Ser Ser Thr
20 25 30

Ile



<210> 99 

<211> 3 

<212> PRT 

<213> Homo sapiens 

<400> 99 
Leu Ser Ser 
1 



<210> 100 
<211> 62 
<212> PRT 

<213> Homo sapiens 
<400> 100 

Arg Leu Cys Asn Ala His Ser Pro Arg Val Leu Pro Ala Leu Ser Gly 
1 5 10 15

Gly Cys Ala Gly Gly Arg Val Arg Ser Phe Ser Phe His Met Val Gln
20 25 30

Arg Ala Arg Thr Ser Pro Pro Arg Ser Val Ala Leu Gln Tyr Phe His
35 40 45

Gly Tyr Glu Cys Lys Ile Tyr Lys Tyr Ile Asn Leu Arg Leu
50 55 60 



<210> 101 
<211> 2 
<212> PRT 



<213> Homo sapiens 



<400> 101 
Gln Leu
1 



<210> 102 
<211> 5 
<212> PRT 

<213> Homo sapiens 
<400> 102 

Tyr Asn Leu Leu Asn 
1 5 



<210> 103 
<211> 3 
<212> PRT 

<213> Homo sapiens 

<400> 103 
Phe Arg Val 
1 



<210> 104 
<211> 14 
<212> PRT 

<213> Homo sapiens 
<220> 

<223> From Seq ID 78 to ID 104, there are 27 protein
sequences translated from Seq ID No. 76. Together,
they form the whole protein sequence.

<400> 104 

Ile Ile Lys Phe Phe Ile Gln Lys Leu Lys Lys Lys Lys Lys
1 5 10



<210> 105 
<211> 1721 
<212> PRT 

<213> Homo sapiens 
<400> 105 

Met Ala Gln Phe Pro Thr Pro Phe Gly Gly Ser Leu Asp Ile Trp Ala
1 5 10 15

Ile Thr Val Glu Glu Arg Ala Lys His Asp Gln Gln Phe His Ser Leu
20 25 30

Lys Pro Ile Ser Gly Phe Ile Thr Gly Asp Gln Ala Arg Asn Phe Phe
35 40 45

Phe Gln Ser Gly Leu Pro Gln Pro Val Leu Ala Gln Ile Trp Ala Leu
50 55 60

Ala Asp Met Asn Asn Asp Gly Arg Met Asp Gln Val Glu Phe Ser Ile
65 70 75 80

Ala Met Lys Leu Ile Lys Leu Lys Leu Gln Gly Tyr Gln Leu Pro Ser
85 90 95

Ala Leu Pro Pro Val Met Lys Gln Gln Pro Val Ala Ile Ser Ser Ala
100 105 110

Pro Pro Phe Gly Met Gly Gly Ile Ala Ser Met Pro Pro Leu Thr Ala
115 120 125

Val Ala Pro Val Pro Met Gly Ser Ile Pro Val Val Gly Met Ser Pro
130 135 140

Thr Leu Val Ser Ser Val Pro Thr Ala Ala Val Pro Pro Leu Ala Asn
145 150 155 160

Gly Ala Pro Pro Val Ile Gln Pro Leu Pro Ala Phe Ala His Pro Ala
165 170 175

Ala Thr Leu Pro Lys Ser Ser Ser Phe Ser Arg Ser Gly Pro Gly Ser
180 185 190

Gln Leu Asn Thr Lys Leu Gln Lys Ala Gln Ser Phe Asp Val Ala Ser
195 200 205

Val Pro Pro Val Ala Glu Trp Ala Val Pro Gln Ser Ser Arg Leu Lys
210 215 220

Tyr Arg Gln Leu Phe Asn Ser His Asp Lys Thr Met Ser Gly His Leu
225 230 235 240

Thr Gly Pro Gln Ala Arg Thr Ile Leu Met Gln Ser Ser Leu Pro Gln
245 250 255

Ala Gln Leu Ala Ser Ile Trp Asn Leu Ser Asp Ile Asp Gln Asp Gly
260 265 270

Lys Leu Thr Ala Glu Glu Phe Ile Leu Ala Met His Leu Ile Asp Val
275 280 285

Ala Met Ser Gly Gln Pro Leu Pro Pro Val Leu Pro Pro Glu Tyr Ile
290 295 300

Pro Pro Ser Phe Arg Arg Val Arg Ser Gly Ser Gly Ile Ser Val Ile
305 310 315 320

Ser Ser Thr Ser Val Asp Gln Arg Leu Pro Glu Glu Pro Val Leu Glu
325 330 335

Asp Glu Gln Gln Gln Leu Glu Lys Lys Leu Pro Val Thr Phe Glu Asp
340 345 350

Lys Lys Arg Glu Asn Phe Glu Arg Gly Asn Leu Glu Leu Glu Lys Arg
355 360 365

Arg Gln Ala Leu Leu Glu Gln Gln Arg Lys Glu Gln Glu Arg Leu Ala
370 375 380

Gln Leu Glu Arg Ala Glu Gln Glu Arg Lys Glu Arg Glu Arg Gln Glu
385 390 395 400

Gln Glu Arg Lys Arg Gln Leu Glu Leu Glu Lys Gln Leu Glu Lys Gln
405 410 415

Arg Glu Leu Glu Arg Gln Arg Glu Glu Glu Arg Arg Lys Glu Ile Glu
420 425 430

Arg Arg Glu Ala Ala Lys Arg Glu Leu Glu Arg Gln Arg Gln Leu Glu
435 440 445

Trp Glu Arg Asn Arg Arg Gln Glu Leu Leu Asn Gln Arg Asn Lys Glu
450 455 460

Gln Glu Asp Ile Val Val Leu Lys Ala Lys Lys Lys Thr Leu Glu Phe
465 470 475 480

Glu Leu Glu Ala Leu Asn Asp Lys Lys His Gln Leu Glu Gly Lys Leu
485 490 495

Gln Asp Ile Arg Cys Arg Leu Thr Thr Gln Arg Gln Glu Ile Glu Ser
500 505 510

Thr Asn Lys Ser Arg Glu Leu Arg Ile Ala Glu Ile Thr His Leu Gln
515 520 525

Gln Gln Leu Gln Glu Ser Gln Gln Met Leu Gly Arg Leu Ile Pro Glu
530 535 540

Lys Gln Ile Leu Asn Asp Gln Leu Lys Gln Val Gln Gln Asn Ser Leu
545 550 555 560

His Arg Asp Ser Leu Val Thr Leu Lys Arg Ala Leu Glu Ala Lys Glu
565 570 575

Leu Ala Arg Gln His Leu Arg Asp Gln Leu Asp Glu Val Glu Lys Glu
580 585 590

Thr Arg Ser Lys Leu Gln Glu Ile Asp Ile Phe Asn Asn Gln Leu Lys
595 600 605

Glu Leu Arg Glu Ile His Asn Lys Gln Gln Leu Gln Lys Gln Lys Ser
610 615 620

Met Glu Ala Glu Arg Leu Lys Gln Lys Glu Gln Glu Arg Lys Ile Ile
625 630 635 640

Glu Leu Glu Lys Gln Lys Glu Glu Ala Gln Arg Arg Ala Gln Glu Arg
645 650 655

Asp Lys Gln Trp Leu Glu His Val Gln Gln Glu Asp Glu His Gln Arg
660 665 670

Pro Arg Lys Leu His Glu Glu Glu Lys Leu Lys Arg Glu Glu Ser Val
675 680 685

Lys Lys Lys Asp Gly Glu Glu Lys Gly Lys Gln Glu Ala Gln Asp Lys
690 695 700

Leu Gly Arg Leu Phe His Gln His Gln Glu Pro Ala Lys Pro Ala Val
705 710 715 720

Gln Ala Pro Trp Ser Thr Ala Glu Lys Gly Pro Leu Thr Ile Ser Ala
725 730 735

Gln Glu Asn Val Lys Val Val Tyr Tyr Arg Ala Leu Tyr Pro Phe Glu
740 745 750

Ser Arg Ser His Asp Glu Ile Thr Ile Gln Pro Gly Asp Ile Val Met
755 760 765

Val Lys Gly Glu Trp Val Asp Glu Ser Gln Thr Gly Glu Pro Gly Trp
770 775 780

Leu Gly Gly Glu Leu Lys Gly Lys Thr Gly Trp Phe Pro Ala Asn Tyr
785 790 795 800

Ala Glu Lys Ile Pro Glu Asn Glu Val Pro Ala Pro Val Lys Pro Val
805 810 815

Thr Asp Ser Thr Ser Ala Pro Ala Pro Lys Leu Ala Leu Arg Glu Thr 
820 825 830 

Pro Ala Pro Leu Ala Val Thr Ser Ser Glu Pro Ser Thr Thr Pro Asn 
835 840 845 

Asn Trp Ala Asp Phe Ser Ser Thr Trp Pro Thr Ser Thr Asn Glu Lys 
850 855 860 

Pro Glu Thr Asp Asn Trp Asp Ala Trp Ala Ala Gln Pro Ser Leu Thr
865 870 875 880

Val Pro Ser Ala Gly Gln Leu Arg Gln Arg Ser Ala Phe Thr Pro Ala
885 890 895

Thr Ala Thr Gly Ser Ser Pro Ser Pro Val Leu Gly Gln Gly Glu Lys
900 905 910

Val Glu Gly Leu Gln Ala Gln Ala Leu Tyr Pro Trp Arg Ala Lys Lys
915 920 925

Asp Asn His Leu Asn Phe Asn Lys Asn Asp Val Ile Thr Val Leu Glu
930 935 940

Gln Gln Asp Met Trp Trp Phe Gly Glu Val Gln Gly Gln Lys Gly Trp
945 950 955 960

Phe Pro Lys Ser Tyr Val Lys Leu Ile Ser Gly Pro Ile Arg Lys Ser
965 970 975

Thr Ser Met Asp Ser Gly Ser Ser Glu Ser Pro Ala Ser Leu Lys Arg
980 985 990

Val Ala Ser Pro Ala Ala Lys Pro Val Val Ser Gly Glu Glu Phe Ile
995 1000 1005

Ala Met Tyr Thr Tyr Glu Ser Ser Glu Gln Gly Asp Leu Thr Phe Gln
1010 1015 1020

Gln Gly Asp Val Ile Leu Val Thr Lys Lys Asp Gly Asp Trp Trp Thr
1025 1030 1035 1040

Gly Thr Val Gly Asp Lys Ala Gly Val Phe Pro Ser Asn Tyr Val Arg
1045 1050 1055

Leu Lys Asp Ser Glu Gly Ser Gly Thr Ala Gly Lys Thr Gly Ser Leu
1060 1065 1070

Gly Lys Lys Pro Glu Ile Ala Gln Val Ile Ala Ser Tyr Thr Ala Thr
1075 1080 1085

Gly Pro Glu Gln Leu Thr Leu Ala Pro Gly Gln Leu Ile Leu Ile Arg
1090 1095 1100

Lys Lys Asn Pro Gly Gly Trp Trp Glu Gly Glu Leu Gln Ala Arg Gly
1105 1110 1115 1120

Lys Lys Arg Gln Ile Gly Trp Phe Pro Ala Asn Tyr Val Lys Leu Leu
1125 1130 1135

Asn Pro Gly Thr Ser Lys Ile Thr Pro Thr Glu Pro Pro Lys Ser Thr
1140 1145 1150

Ala Leu Ala Ala Val Cys Gln Val Ile Gly Met Tyr Asp Tyr Thr Ala
1155 1160 1165

Gln Asn Asp Asp Glu Leu Ala Phe Asn Lys Gly Gln Ile Ile Asn Val
1170 1175 1180

Leu Asn Lys Glu Asp Pro Asp Trp Trp Lys Gly Glu Val Asn Gly Gln
1185 1190 1195 1200

Val Gly Leu Phe Pro Ser Asn Tyr Val Lys Leu Thr Thr Asp Met Asp 
1205 1210 1215 

Pro Ser Gln Gln Trp Cys Ser Asp Leu His Leu Leu Asp Met Leu Thr
1220 1225 1230

Pro Thr Glu Arg Lys Arg Gln Gly Tyr Ile His Glu Leu Ile Val Thr
1235 1240 1245

Glu Glu Asn Tyr Val Asn Asp Leu Gln Leu Val Thr Glu Ile Phe Gln
1250 1255 1260

Lys Pro Leu Met Glu Ser Glu Leu Leu Thr Glu Lys Glu Val Ala Met
1265 1270 1275 1280

Ile Phe Val Asn Trp Lys Glu Leu Ile Met Cys Asn Ile Lys Leu Leu
1285 1290 1295

Lys Ala Leu Arg Val Arg Lys Lys Met Ser Gly Glu Lys Met Pro Val
1300 1305 1310

Lys Met Ile Gly Asp Ile Leu Ser Ala Gln Leu Pro His Met Gln Pro
1315 1320 1325

Tyr Ile Arg Phe Cys Ser Arg Gln Leu Asn Gly Ala Ala Leu Ile Gln
1330 1335 1340

Gln Lys Thr Asp Glu Ala Pro Asp Phe Lys Glu Phe Val Lys Arg Leu
1345 1350 1355 1360

Glu Met Asp Pro Arg Cys Lys Gly Met Pro Leu Ser Ser Phe Ile Leu
1365 1370 1375

Lys Pro Met Gln Arg Val Thr Arg Tyr Pro Leu Ile Ile Lys Asn Ile
1380 1385 1390

Leu Glu Asn Thr Pro Glu Asn His Pro Asp His Ser His Leu Lys His
1395 1400 1405

Ala Leu Glu Lys Ala Glu Glu Leu Cys Ser Gln Val Asn Glu Gly Val
1410 1415 1420

Arg Glu Lys Glu Asn Ser Asp Arg Leu Glu Trp Ile Gln Ala His Val
1425 1430 1435 1440

Gln Cys Glu Gly Leu Ser Glu Gln Leu Val Phe Asn Ser Val Thr Asn
1445 1450 1455

Cys Leu Gly Pro Arg Lys Phe Leu His Ser Gly Lys Leu Tyr Lys Ala 
1460 1465 1470 

Lys Asn Asn Lys Glu Leu Tyr Gly Phe Leu Phe Asn Asp Phe Leu Leu 
1475 1480 1485 

Leu Thr Gln Ile Thr Lys Pro Leu Gly Ser Ser Gly Thr Asp Lys Val
1490 1495 1500

Phe Ser Pro Lys Ser Asn Leu Gln Tyr Lys Met Tyr Lys Thr Pro Ile
1505 1510 1515 1520

Phe Leu Asn Glu Val Leu Val Lys Leu Pro Thr Asp Pro Ser Gly Asp
1525 1530 1535

Glu Pro Ile Phe His Ile Ser His Ile Asp Arg Val Tyr Thr Leu Arg
1540 1545 1550

Ala Glu Ser Ile Asn Glu Arg Thr Ala Trp Val Gln Lys Ile Lys Ala
1555 1560 1565

Ala Ser Glu Leu Tyr Ile Glu Thr Glu Lys Lys Lys Arg Glu Lys Ala
1570 1575 1580

Tyr Leu Val Arg Ser Gln Arg Ala Thr Gly Ile Gly Arg Leu Met Val
1585 1590 1595 1600

Asn Val Val Glu Gly Ile Glu Leu Lys Pro Cys Arg Ser His Gly Lys
1605 1610 1615

Ser Asn Pro Tyr Cys Glu Val Thr Met Gly Ser Gln Cys His Ile Thr
1620 1625 1630

Lys Thr Ile Gln Asp Thr Leu Asn Pro Lys Trp Asn Ser Asn Cys Gln
1635 1640 1645

Phe Phe Ile Arg Asp Leu Glu Gln Glu Val Leu Cys Ile Thr Val Phe
1650 1655 1660

Glu Arg Asp Gln Phe Ser Pro Asp Asp Phe Leu Gly Arg Thr Glu Ile
1665 1670 1675 1680

Arg Val Ala Asp Ile Lys Lys Asp Gln Gly Ser Lys Gly Pro Val Thr
1685 1690 1695

Lys Cys Leu Leu Leu His Glu Val Pro Thr Gly Glu Ile Val Val Arg
1700 1705 1710

Leu Asp Leu Gln Leu Phe Asp Glu Pro
1715 1720



<210> 106 
<211> 1220 
<212> PRT 

<213> Homo sapiens 
<400> 106 

Met Ala Gln Phe Pro Thr Pro Phe Gly Gly Ser Leu Asp Ile Trp Ala
1 5 10 15

Ile Thr Val Glu Glu Arg Ala Lys His Asp Gln Gln Phe His Ser Leu
20 25 30

Lys Pro Ile Ser Gly Phe Ile Thr Gly Asp Gln Ala Arg Asn Phe Phe
35 40 45

Phe Gln Ser Gly Leu Pro Gln Pro Val Leu Ala Gln Ile Trp Ala Leu
50 55 60

Ala Asp Met Asn Asn Asp Gly Arg Met Asp Gln Val Glu Phe Ser Ile
65 70 75 80

Ala Met Lys Leu Ile Lys Leu Lys Leu Gln Gly Tyr Gln Leu Pro Ser
85 90 95

Ala Leu Pro Pro Val Met Lys Gln Gln Pro Val Ala Ile Ser Ser Ala
100 105 110

Pro Pro Phe Gly Met Gly Gly Ile Ala Ser Met Pro Pro Leu Thr Ala
115 120 125

Val Ala Pro Val Pro Met Gly Ser Ile Pro Val Val Gly Met Ser Pro
130 135 140

Thr Leu Val Ser Ser Val Pro Thr Ala Ala Val Pro Pro Leu Ala Asn
145 150 155 160

Gly Ala Pro Pro Val Ile Gln Pro Leu Pro Ala Phe Ala His Pro Ala
165 170 175

Ala Thr Leu Pro Lys Ser Ser Ser Phe Ser Arg Ser Gly Pro Gly Ser
180 185 190

Gln Leu Asn Thr Lys Leu Gln Lys Ala Gln Ser Phe Asp Val Ala Ser
195 200 205

Val Pro Pro Val Ala Glu Trp Ala Val Pro Gln Ser Ser Arg Leu Lys
210 215 220

Tyr Arg Gln Leu Phe Asn Ser His Asp Lys Thr Met Ser Gly His Leu
225 230 235 240

Thr Gly Pro Gln Ala Arg Thr Ile Leu Met Gln Ser Ser Leu Pro Gln
245 250 255

Ala Gln Leu Ala Ser Ile Trp Asn Leu Ser Asp Ile Asp Gln Asp Gly
260 265 270

Lys Leu Thr Ala Glu Glu Phe Ile Leu Ala Met His Leu Ile Asp Val
275 280 285

Ala Met Ser Gly Gln Pro Leu Pro Pro Val Leu Pro Pro Glu Tyr Ile
290 295 300

Pro Pro Ser Phe Arg Arg Val Arg Ser Gly Ser Gly Ile Ser Val Ile
305 310 315 320

Ser Ser Thr Ser Val Asp Gln Arg Leu Pro Glu Glu Pro Val Leu Glu
325 330 335

Asp Glu Gln Gln Gln Leu Glu Lys Lys Leu Pro Val Thr Phe Glu Asp
340 345 350

Lys Lys Arg Glu Asn Phe Glu Arg Gly Asn Leu Glu Leu Glu Lys Arg
355 360 365

Arg Gln Ala Leu Leu Glu Gln Gln Arg Lys Glu Gln Glu Arg Leu Ala
370 375 380

Gln Leu Glu Arg Ala Glu Gln Glu Arg Lys Glu Arg Glu Arg Gln Glu
385 390 395 400

Gln Glu Arg Lys Arg Gln Leu Glu Leu Glu Lys Gln Leu Glu Lys Gln
405 410 415

Arg Glu Leu Glu Arg Gln Arg Glu Glu Glu Arg Arg Lys Glu Ile Glu
420 425 430

Arg Arg Glu Ala Ala Lys Arg Glu Leu Glu Arg Gln Arg Gln Leu Glu
435 440 445

Trp Glu Arg Asn Arg Arg Gln Glu Leu Leu Asn Gln Arg Asn Lys Glu
450 455 460

Gln Glu Asp Ile Val Val Leu Lys Ala Lys Lys Lys Thr Leu Glu Phe
465 470 475 480

Glu Leu Glu Ala Leu Asn Asp Lys Lys His Gln Leu Glu Gly Lys Leu
485 490 495

Gln Asp Ile Arg Cys Arg Leu Thr Thr Gln Arg Gln Glu Ile Glu Ser
500 505 510

Thr Asn Lys Ser Arg Glu Leu Arg Ile Ala Glu Ile Thr His Leu Gln
515 520 525

Gln Gln Leu Gln Glu Ser Gln Gln Met Leu Gly Arg Leu Ile Pro Glu
530 535 540

Lys Gln Ile Leu Asn Asp Gln Leu Lys Gln Val Gln Gln Asn Ser Leu
545 550 555 560

His Arg Asp Ser Leu Val Thr Leu Lys Arg Ala Leu Glu Ala Lys Glu
565 570 575

Leu Ala Arg Gln His Leu Arg Asp Gln Leu Asp Glu Val Glu Lys Glu
580 585 590

Thr Arg Ser Lys Leu Gln Glu Ile Asp Ile Phe Asn Asn Gln Leu Lys
595 600 605

Glu Leu Arg Glu Ile His Asn Lys Gln Gln Leu Gln Lys Gln Lys Ser
610 615 620

Met Glu Ala Glu Arg Leu Lys Gln Lys Glu Gln Glu Arg Lys Ile Ile
625 630 635 640

Glu Leu Glu Lys Gln Lys Glu Glu Ala Gln Arg Arg Ala Gln Glu Arg
645 650 655

Asp Lys Gln Trp Leu Glu His Val Gln Gln Glu Asp Glu His Gln Arg
660 665 670

Pro Arg Lys Leu His Glu Glu Glu Lys Leu Lys Arg Glu Glu Ser Val
675 680 685

Lys Lys Lys Asp Gly Glu Glu Lys Gly Lys Gln Glu Ala Gln Asp Lys
690 695 700

Leu Gly Arg Leu Phe His Gln His Gln Glu Pro Ala Lys Pro Ala Val
705 710 715 720



Gln Ala Pro Trp Ser Thr Ala Glu Lys Gly Pro Leu Thr Ile Ser Ala
725 730 735

Gln Glu Asn Val Lys Val Val Tyr Tyr Arg Ala Leu Tyr Pro Phe Glu
740 745 750

Ser Arg Ser His Asp Glu Ile Thr Ile Gln Pro Gly Asp Ile Val Met
755 760 765

Val Lys Gly Glu Trp Val Asp Glu Ser Gln Thr Gly Glu Pro Gly Trp
770 775 780

Leu Gly Gly Glu Leu Lys Gly Lys Thr Gly Trp Phe Pro Ala Asn Tyr
785 790 795 800

Ala Glu Lys Ile Pro Glu Asn Glu Val Pro Ala Pro Val Lys Pro Val
805 810 815

Thr Asp Ser Thr Ser Ala Pro Ala Pro Lys Leu Ala Leu Arg Glu Thr 
820 825 830 

Pro Ala Pro Leu Ala Val Thr Ser Ser Glu Pro Ser Thr Thr Pro Asn 
835 840 845 

Asn Trp Ala Asp Phe Ser Ser Thr Trp Pro Thr Ser Thr Asn Glu Lys 
850 855 860 

Pro Glu Thr Asp Asn Trp Asp Ala Trp Ala Ala Gln Pro Ser Leu Thr
865 870 875 880

Val Pro Ser Ala Gly Gln Leu Arg Gln Arg Ser Ala Phe Thr Pro Ala
885 890 895

Thr Ala Thr Gly Ser Ser Pro Ser Pro Val Leu Gly Gln Gly Glu Lys
900 905 910

Val Glu Gly Leu Gln Ala Gln Ala Leu Tyr Pro Trp Arg Ala Lys Lys
915 920 925

Asp Asn His Leu Asn Phe Asn Lys Asn Asp Val Ile Thr Val Leu Glu
930 935 940

Gln Gln Asp Met Trp Trp Phe Gly Glu Val Gln Gly Gln Lys Gly Trp
945 950 955 960

Phe Pro Lys Ser Tyr Val Lys Leu Ile Ser Gly Pro Ile Arg Lys Ser
965 970 975

Thr Ser Met Asp Ser Gly Ser Ser Glu Ser Pro Ala Ser Leu Lys Arg
980 985 990

Val Ala Ser Pro Ala Ala Lys Pro Val Val Ser Gly Glu Glu Phe Ile
995 1000 1005

Ala Met Tyr Thr Tyr Glu Ser Ser Glu Gln Gly Asp Leu Thr Phe Gln
1010 1015 1020

Gln Gly Asp Val Ile Leu Val Thr Lys Lys Asp Gly Asp Trp Trp Thr
1025 1030 1035 1040

Gly Thr Val Gly Asp Lys Ala Gly Val Phe Pro Ser Asn Tyr Val Arg
1045 1050 1055

Leu Lys Asp Ser Glu Gly Ser Gly Thr Ala Gly Lys Thr Gly Ser Leu
1060 1065 1070

Gly Lys Lys Pro Glu Ile Ala Gln Val Ile Ala Ser Tyr Thr Ala Thr
1075 1080 1085

Gly Pro Glu Gln Leu Thr Leu Ala Pro Gly Gln Leu Ile Leu Ile Arg
1090 1095 1100

Lys Lys Asn Pro Gly Gly Trp Trp Glu Gly Glu Leu Gln Ala Arg Gly
1105 1110 1115 1120

Lys Lys Arg Gln Ile Gly Trp Phe Pro Ala Asn Tyr Val Lys Leu Leu
1125 1130 1135

Asn Pro Gly Thr Ser Lys Ile Thr Pro Thr Glu Pro Pro Lys Ser Thr
1140 1145 1150

Ala Leu Ala Ala Val Cys Gln Val Ile Gly Met Tyr Asp Tyr Thr Ala
1155 1160 1165

Gln Asn Asp Asp Glu Leu Ala Phe Asn Lys Gly Gln Ile Ile Asn Val
1170 1175 1180

Leu Asn Lys Glu Asp Pro Asp Trp Trp Lys Gly Glu Val Asn Gly Gln
1185 1190 1195 1200

Val Gly Leu Phe Pro Ser Asn Tyr Val Lys Leu Thr Thr Asp Met Asp
1205 1210 1215

Pro Ser Gln Gln
1220



<210> 107 
<211> 1270 
<212> PRT 

<213> Xenopus laevis 
<400> 107 

Met Ala Gln Phe Gly Thr Pro Phe Gly Gly Asn Leu Asp Ile Trp Ala
1 5 10 15

Ile Thr Val Glu Glu Arg Ala Lys His Asp Gln Gln Phe His Gly Leu
20 25 30

Lys Pro Thr Ala Gly Tyr Ile Thr Gly Asp Gln Ala Arg Asn Phe Phe
35 40 45

Leu Gln Ser Gly Leu Pro Gln Pro Val Leu Ala Gln Ile Trp Ala Leu
50 55 60

Ala Asp Met Asn Asn Asp Gly Arg Met Asp Gln Leu Glu Phe Ser Ile
65 70 75 80

Ala Met Lys Leu Ile Lys Leu Lys Leu Gln Gly Tyr Pro Leu Pro Ser
85 90 95

Ile Leu Pro Ser Asn Met Leu Lys Gln Pro Val Ala Met Pro Ala Ala
100 105 110

Ala Val Ala Gly Phe Gly Met Ser Gly Ile Val Gly Ile Pro Pro Leu
115 120 125

Ala Ala Val Ala Pro Val Pro Met Pro Ser Ile Pro Val Val Gly Met
130 135 140

Ser Pro Pro Leu Val Ser Ser Val Pro Thr Val Pro Pro Leu Ser Asn
145 150 155 160

Gly Ala Pro Ala Val Ile Gln Ser His Pro Ala Phe Ala His Ser Ala
165 170 175

Thr Leu Pro Lys Ser Ser Ser Phe Gly Arg Ser Val Ala Gly Ser Gln
180 185 190

Ile Asn Thr Lys Leu Gln Lys Ala Gln Ser Phe Asp Val Pro Ala Pro
195 200 205

Pro Leu Val Val Glu Trp Ala Val Pro Ser Ser Ser Arg Leu Lys Tyr
210 215 220

Arg Gln Leu Phe Asn Ser Gln Asp Lys Thr Met Ser Gly Asn Leu Thr
225 230 235 240

Gly Pro Gln Ala Arg Thr Ile Leu Met Gln Ser Ser Leu Pro Gln Ser
245 250 255

Gln Leu Ala Thr Ile Trp Asn Leu Ser Asp Ile Asp Gln Asp Gly Lys
260 265 270

Leu Thr Ala Glu Glu Phe Ile Leu Ala Met His Leu Ile Asp Val Ala
275 280 285

Met Ser Gly Gln Pro Leu Pro Pro Ile Leu Pro Pro Glu Tyr Ile Pro
290 295 300

Pro Ser Phe Arg Arg Val Arg Ser Gly Ser Gly Leu Ser Ile Met Ser
305 310 315 320

Ser Val Ser Val Asp Gln Arg Leu Pro Glu Glu Pro Glu Glu Glu Glu
325 330 335

Pro Gln Asn Ala Asp Lys Lys Leu Pro Val Thr Phe Glu Asp Lys Lys
340 345 350

Arg Glu Asn Phe Glu Arg Gly Asn Leu Glu Leu Glu Lys Arg Arg Gln
355 360 365

Ala Leu Leu Glu Gln Gln Arg Lys Glu Gln Glu Arg Leu Ala Gln Leu
370 375 380

Glu Arg Ala Glu Gln Glu Arg Lys Glu Arg Glu Arg Gln Asp Gln Glu
385 390 395 400

Arg Lys Arg Gln Gln Asp Leu Glu Lys Gln Leu Glu Lys Gln Arg Glu
405 410 415

Leu Glu Arg Gln Arg Glu Glu Glu Arg Arg Lys Glu Ile Glu Arg Arg
420 425 430

Glu Ala Ala Lys Arg Glu Leu Glu Arg Gln Arg Gln Leu Glu Trp Glu
435 440 445

Arg Asn Arg Arg Gln Glu Leu Leu Asn Gln Arg Asn Arg Glu Gln Glu
450 455 460

Asp Ile Val Val Leu Lys Ala Lys Lys Lys Thr Leu Glu Phe Glu Leu
465 470 475 480

Glu Ala Leu Asn Asp Lys Lys His Gln Leu Glu Gly Lys Leu Gln Asp
485 490 495

Ile Arg Cys Arg Leu Thr Thr Gln Arg His Glu Ile Glu Ser Thr Asn
500 505 510

Lys Ser Arg Glu Leu Arg Ile Ala Glu Ile Thr His Leu Gln Gln Gln
515 520 525

Leu Gln Glu Ser Gln Gln Leu Leu Gly Lys Met Ile Pro Glu Lys Gln
530 535 540

Ser Leu Ile Asp Gln Leu Lys Gln Val Gln Gln Asn Ser Leu His Arg
545 550 555 560

Asp Ser Leu Leu Thr Leu Lys Arg Ala Leu Glu Thr Lys Glu Ile Gly
565 570 575

Arg Gln Gln Leu Arg Asp Gln Leu Asp Glu Val Glu Lys Glu Thr Arg
580 585 590

Ala Lys Leu Gln Glu Ile Asp Val Phe Asn Asn Gln Leu Lys Glu Leu
595 600 605

Arg Glu Leu Tyr Asn Lys Gln Gln Phe Gln Lys Gln Gln Asp Phe Glu
610 615 620

Thr Glu Lys Ile Lys Gln Lys Glu Leu Glu Arg Lys Thr Ser Glu Leu
625 630 635 640

Asp Lys Leu Lys Glu Glu Asp Lys Arg Arg Met Leu Glu Gln Asp Lys
645 650 655

Leu Trp Gln Asp Arg Val Lys Gln Glu Glu Glu Arg Tyr Lys Phe Gln
660 665 670



Asp Glu Glu Lys Glu Lys Arg Glu Glu Ser Val Gln Lys Cys Glu Val
675 680 685

Glu Lys Lys Pro Glu Ile Gln Glu Lys Pro Asn Lys Pro Phe His Gln
690 695 700

Pro Pro Glu Pro Gly Lys Leu Gly Gly Gln Ile Pro Trp Met Asn Thr
705 710 715 720

Glu Lys Ala Pro Leu Thr Ile Asn Gln Gly Asp Val Lys Val Val Tyr
725 730 735

Tyr Arg Ala Leu Tyr Pro Phe Asp Ala Arg Ser His Asp Glu Ile Thr
740 745 750

Ile Glu Pro Gly Asp Ile Ile Met Val Asp Glu Ser Gln Thr Gly Glu
755 760 765

Pro Gly Trp Leu Gly Gly Glu Leu Lys Gly Lys Thr Gly Trp Phe Pro 
770 775 780 

Ala Asn Tyr Ala Glu Arg Met Pro Glu Ser Glu Phe Pro Ser Thr Thr 
785 790 795 800 

Lys Pro Ala Ala Glu Thr Thr Ala Lys Pro Thr Val His Val Ala Pro 
805 810 815 

Ser Pro Val Ala Pro Ala Ala Phe Thr Asn Thr Ser Thr Asn Ser Asn 
820 825 830 

Asn Trp Ala Asp Phe Ser Ser Thr Trp Pro Thr Asn Asn Thr Asp Lys 
835 840 845 

Val Glu Ser Asp Asn Trp Asp Thr Trp Ala Ala Gln Pro Ser Leu Thr
850 855 860

Val Pro Ser Ala Gly Gln His Arg Gln Arg Ser Ala Phe Thr Pro Ala
865 870 875 880

Thr Val Thr Gly Ser Ser Pro Ser Pro Val Leu Gly Gln Gly Glu Lys
885 890 895

Val Glu Gly Leu Gln Ala Gln Ala Leu Tyr Pro Trp Arg Ala Lys Lys
900 905 910

Asp Asn His Leu Asn Phe Asn Lys Asn Asp Val Ile Thr Val Leu Glu
915 920 925

Gln Gln Asp Met Trp Trp Phe Gly Glu Val Gln Gly Gln Lys Gly Trp
930 935 940

Phe Pro Lys Ser Tyr Val Lys Leu Ile Ser Gly Pro Leu Arg Lys Ser
945 950 955 960

Thr Ser Ile Asp Ser Thr Ser Ser Glu Ser Pro Ala Ser Leu Lys Arg
965 970 975

Val Ser Ser Pro Ala Phe Lys Pro Ala Ile Gln Gly Glu Glu Tyr Ile
980 985 990

Ser Met Tyr Thr Tyr Glu Ser Asn Glu Gln Gly Asp Leu Thr Phe Gln
995 1000 1005

Gln Gly Asp Leu Ile Val Val Ile Lys Lys Asp Gly Asp Trp Trp Thr
1010 1015 1020

Gly Thr Val Gly Glu Lys Thr Gly Val Phe Pro Ser Asn Tyr Val Arg 
1025 1030 1035 1040 

Pro Lys Asp Ser Glu Ala Ala Gly Ser Gly Gly Lys Thr Gly Ser Leu 
1045 1050 1055 

Gly Lys Lys Pro Glu Ile Ala Gln Val Ile Ala Ser Tyr Ala Ala Thr
1060 1065 1070

Ala Pro Glu Gln Leu Thr Leu Ala Pro Gly Gln Leu Ile Leu Ile Arg
1075 1080 1085

Lys Lys Asn Pro Gly Gly Trp Trp Glu Gly Glu Leu Gln Ala Arg Gly
1090 1095 1100

Lys Lys Arg Gln Ile Gly Trp Phe Pro Ala Asn Tyr Val Lys Leu Leu
1105 1110 1115 1120

Ser Pro Gly Thr Asn Lys Ser Thr Pro Thr Glu Pro Pro Lys Pro Thr
1125 1130 1135

Ser Leu Pro Pro Thr Cys Gln Val Ile Gly Met Tyr Asp Tyr Ile Ala
1140 1145 1150

Gln Asn Asp Asp Glu Leu Ala Phe Ser Lys Gly Gln Val Ile Asn Val
1155 1160 1165

Leu Asn Lys Glu Asp Pro Asp Trp Trp Lys Gly Glu Leu Asn Gly His
1170 1175 1180

Val Gly Leu Phe Pro Ser Asn Tyr Val Lys Leu Thr Thr Asp Met Asp
1185 1190 1195 1200

Pro Ser Gln Gln Phe Arg Leu Gly Val Lys Pro Ala Gly Gly Ile Pro
1205 1210 1215

Ala Thr Gly Asp Arg Pro Phe Ile Leu Phe Pro Phe Arg Asp Gly Pro
1220 1225 1230

Ser Leu Leu Pro Asn Ala Phe Gln Ala Pro Pro Leu Ser Val Val Met
1235 1240 1245

Ile Lys Phe Arg Cys Phe Thr Ala Pro Arg Phe Cys Pro Asp Met Asn
1250 1255 1260

Val Lys Tyr Ile Asn Ile
1265 1270



<210> 108 
<211> 1094 
<212> PRT 

<213> Drosophila sp. 
<400> 108 

Met Asn Ser Ala Val Asp Ala Trp Ala Val Thr Pro Arg Glu Arg Leu 
1 5 10 15

Lys Tyr Gln Glu Gln Phe Arg Ala Leu Gln Pro Gln Ala Gly Phe Val
20 25 30

Thr Gly Ala Gln Ala Lys Gly Phe Phe Leu Gln Ser Gln Leu Pro Pro
35 40 45

Leu Ile Leu Gly Gln Ile Trp Ala Leu Ala Asp Thr Asp Ser Asp Gly
50 55 60

Lys Met Asn Ile Asn Glu Phe Ser Ile Ala Cys Lys Leu Ile Asn Leu
65 70 75 80

Lys Leu Arg Gly Met Asp Val Pro Lys Val Leu Pro Pro Ser Leu Leu 
85 90 95 

Ser Ser Leu Thr Gly Asp Val Pro Ser Met Thr Pro Arg Gly Ser Thr 
100 105 110 

Ser Ser Leu Ser Pro Leu Asp Pro Leu Lys Gly lie Val Pro Ala Val 
115 120 125 

Ala Pro Val Val Pro Val Val Ala Pro Pro Val Ala Val Ala Thr Val 
130 135 140 

lie Ser Pro Pro Gly Val Ser Val Pro Ser Gly Pro Thr Pro Pro Thr 
145 150 155 160 

Ser Asn Pro Pro Ser Arg His Thr Ser lie Ser Glu Arg Ala Pro Ser 
165 • 170 175 

lie Glu Ser Val Asn Gin Gly Glu Trp Ala Val Gin Ala Ala Gin Lys 
180 185 190 

Arg Lys Tyr Thr Gin Val Phe Asn Ala Asn Asp Arg Thr Arg Ser Gly 
195 200 205 

Tyr Leu Thr Gly Ser Gin Ala Arg Gly Val Leu Val Gin Ser Lys Leu 
210 215 220 

Pro Gin Val Thr Leu Ala Gin lie Trp Thr Leu Ser Asp lie Asp Gly 
225 230 235 240 



Asp Gly Arg Leu Asn Cys Asp Glu Phe Ile Leu Ala Met Phe Leu Cys 
245 250 255 

Glu Lys Ala Met Ala Gly Glu Lys Ile Pro Val Thr Leu Pro Gln Glu 
260 265 270 

Trp Val Pro Pro Asn Leu Arg Lys Ile Lys Ser Arg Pro Gly Ser Val 
275 280 285 

Ser Gly Val Val Ser Arg Pro Gly Ser Gln Pro Ala Ser Arg His Ala 
290 295 300 

Ser Val Ser Ser Gln Ser Gly Val Gly Val Val Asp Ala Asp Pro Thr 
305 310 315 320 

Ala Gly Leu Pro Gly Gln Thr Ser Phe Glu Asp Lys Arg Lys Glu Asn 
325 330 335 

Tyr Val Lys Gly Gln Ala Glu Leu Asp Arg Arg Arg Lys Ile Met Glu 
340 345 350 

Asp Gln Gln Arg Lys Glu Arg Glu Glu Arg Glu Arg Lys Glu Arg Glu 
355 360 365 

Glu Ala Asp Lys Arg Glu Lys Ala Arg Leu Glu Ala Glu Arg Lys Gln 
370 375 380 

Gln Glu Glu Leu Glu Arg Gln Leu Gln Arg Gln Arg Glu Ile Glu Met 
385 390 395 400 

Glu Lys Glu Glu Gln Arg Lys Arg Glu Leu Glu Ala Lys Glu Ala Ala 
405 410 415 



Arg Lys Glu Leu Glu Lys Gln Arg Gln Gln Glu Trp Glu Gln Ala Arg 
420 425 430 

Ile Ala Glu Met Asn Ala Gln Lys Glu Arg Glu Gln Glu Arg Val Leu 
435 440 445 

Lys Gln Lys Ala His Asn Thr Gln Leu Asn Val Glu Leu Ser Thr Leu 
450 455 460 



Asn Glu Lys Ile Lys Glu Leu Ser Gln Arg Ile Cys Asp Thr Arg Ala 
465 470 475 480 

Gly Val Thr Asn Val Lys Thr Val Ile Asp Gly Met Arg Thr Gln Arg 
485 490 495 

Asp Thr Ser Met Ser Glu Met Ser Gln Leu Lys Ala Arg Ile Lys Glu 
500 505 510 

Gln Asn Ala Lys Leu Leu Gln Leu Thr Gln Glu Arg Ala Lys Trp Glu 
515 520 525 

Ala Lys Ser Lys Ala Ser Gly Ala Ala Leu Gly Gly Glu Asn Ala Gln 
530 535 540 

Gln Glu Gln Leu Asn Ala Ala Phe Ala His Lys Gln Leu Ile Ile Asn 
545 550 555 560 

Gln Ile Lys Asp Lys Val Glu Asn Ile Ser Lys Glu Ile Glu Ser Lys 
565 570 575 

Lys Glu Asp Ile Asn Thr Asn Asp Val Gln Met Ser Glu Leu Lys Ala 
580 585 590 

Glu Leu Ser Ala Leu Ile Thr Lys Cys Glu Asp Leu Tyr Lys Glu Tyr 
595 600 605 

Asp Val Gln Arg Thr Ser Val Leu Glu Leu Lys Tyr Asn Arg Lys Asn 
610 615 620 

Glu Thr Ser Val Ser Ser Ala Trp Asp Thr Gly Ser Ser Ser Ala Trp 
625 630 635 640 

Glu Glu Thr Gly Thr Thr Val Thr Asp Pro Tyr Ala Val Ala Ser Asn 
645 650 655 

Asp Ile Ser Ala Leu Ala Ala Pro Ala Val Asp Leu Gly Gly Pro Ala 
660 665 670 

Pro Glu Gly Phe Val Lys Tyr Gln Ala Val Tyr Glu Phe Asn Ala Arg 
675 680 685 

Asn Ala Glu Glu Ile Thr Phe Val Pro Gly Asp Ile Ile Leu Val Pro 
690 695 700 

Leu Glu Gln Asn Ala Glu Pro Gly Trp Leu Ala Gly Glu Ile Asn Gly 
705 710 715 720 

His Thr Gly Trp Phe Pro Glu Ser Tyr Val Glu Lys Leu Glu Val Gly 
725 730 735 

Glu Val Ala Pro Val Ala Ala Val Glu Ala Pro Val Asp Ala Gln Val 
740 745 750 

Ala Asp Thr Tyr Asn Asp Asn Ile Asn Thr Ser Ser Ile Pro Ala Ala 
755 760 765 

Ser Ala Asp Leu Thr Ala Ala Gly Asp Val Glu Tyr Tyr Ile Ala Ala 
770 775 780 

Tyr Pro Tyr Glu Ser Ala Glu Glu Gly Asp Leu Ser Phe Ser Ala Gly 
785 790 795 800 

Glu Met Val Met Val Ile Lys Lys Glu Gly Glu Trp Trp Thr Gly Thr 
805 810 815 

Ile Gly Ser Arg Thr Gly Met Phe Pro Ser Asn Tyr Val Gln Lys Ala 
820 825 830 

Asp Val Gly Thr Ala Ser Thr Ala Ala Ala Glu Pro Val Glu Ser Leu 
835 840 845 

Asp Gln Glu Thr Thr Leu Asn Gly Asn Ala Ala Tyr Thr Ala Ala Pro 
850 855 860 



Val Glu Ala Gln Glu Gln Val Tyr Gln Pro Leu Pro Val Gln Glu Pro 
865 870 875 880 

Ser Glu Gln Pro Ile Ser Ser Pro Gly Val Gly Ala Glu Glu Ala His 
885 890 895 

Glu Asp Leu Asp Thr Glu Val Ser Gln Ile Asn Thr Gln Ser Lys Thr 
900 905 910 

Gln Ser Ser Glu Pro Ala Glu Ser Tyr Ser Arg Pro Met Ser Arg Thr 
915 920 925 

Ser Ser Met Thr Pro Gly Met Arg Ala Lys Arg Ser Glu Ile Ala Gln 
930 935 940 

Val Ile Ala Pro Tyr Glu Ala Thr Ser Thr Glu Gln Leu Ser Leu Thr 
945 950 955 960 

Arg Gly Gln Leu Ile Met Ile Arg Lys Lys Thr Asp Ser Gly Trp Trp 
965 970 975 

Glu Gly Glu Leu Gln Ala Lys Gly Arg Arg Arg Gln Ile Gly Trp Phe 
980 985 990 

Pro Ala Thr Tyr Val Lys Val Leu Gln Gly Gly Arg Asn Ser Gly Arg 
995 1000 1005 

Asn Thr Pro Val Ser Gly Ser Arg Ile Glu Met Thr Glu Gln Ile Leu 
1010 1015 1020 

Asp Lys Val Ile Ala Leu Tyr Pro Tyr Lys Ala Gln Asn Asp Asp Glu 
1025 1030 1035 1040 

Leu Ser Phe Asp Lys Asp Asp Ile Ile Ser Val Leu Gly Arg Asp Glu 
1045 1050 1055 

Pro Glu Trp Trp Arg Gly Glu Leu Asn Gly Leu Ser Gly Leu Phe Pro 
1060 1065 1070 

Ser Asn Tyr Val Gly Pro Phe Val Thr Ser Gly Lys Pro Ala Lys Ala 
1075 1080 1085 

Asn Gly Thr Thr Lys Lys 
1090 



<210> 109 
<211> 520 
<212> PRT 

<213> Homo sapiens 
<400> 109 

Met Glu Ala Glu Arg Leu Lys Gln Lys Glu Gln Glu Arg Lys Ile Ile 
1 5 10 15 

Glu Leu Glu Lys Gln Lys Glu Glu Ala Gln Arg Arg Ala Gln Glu Arg 
20 25 30 

Asp Lys Gln Trp Leu Glu His Val Gln Gln Glu Asp Glu His Gln Arg 
35 40 45 

Pro Arg Lys Leu His Glu Glu Glu Lys Leu Lys Arg Glu Glu Ser Val 
50 55 60 

Lys Lys Lys Asp Gly Glu Glu Lys Gly Lys Gln Glu Ala Gln Asp Lys 
65 70 75 80 

Leu Gly Arg Leu Phe His Gln His Gln Glu Pro Ala Lys Pro Ala Val 
85 90 95 

Gln Ala Pro Trp Ser Thr Ala Glu Lys Gly Pro Leu Thr Ile Ser Ala 
100 105 110 

Gln Glu Asn Val Lys Val Val Tyr Tyr Arg Ala Leu Tyr Pro Phe Glu 
115 120 125 

Ser Arg Ser His Asp Glu Ile Thr Ile Gln Pro Gly Asp Ile Val Met 
130 135 140 

Val Asp Glu Ser Gln Thr Gly Glu Pro Gly Trp Leu Gly Gly Glu Leu 
145 150 155 160 

Lys Gly Lys Thr Gly Trp Phe Pro Ala Asn Tyr Ala Glu Lys Ile Pro 
165 170 175 

Glu Asn Glu Val Pro Ala Pro Val Lys Pro Val Thr Asp Ser Thr Ser 
180 185 190 

Ala Pro Ala Pro Lys Leu Ala Leu Arg Glu Thr Pro Ala Pro Leu Ala 
195 200 205 

Val Thr Ser Ser Glu Pro Ser Thr Thr Pro Asn Asn Trp Ala Asp Phe 
210 215 220 

Ser Ser Thr Trp Pro Thr Ser Thr Asn Glu Lys Pro Glu Thr Asp Asn 
225 230 235 240 

Trp Asp Ala Trp Ala Ala Gln Pro Ser Leu Thr Val Pro Ser Ala Gly 
245 250 255 

Gln Leu Arg Gln Arg Ser Ala Phe Thr Pro Ala Thr Ala Thr Gly Ser 
260 265 270 

Ser Pro Ser Pro Val Leu Gly Gln Gly Glu Lys Val Glu Gly Leu Gln 
275 280 285 

Ala Gln Ala Leu Tyr Pro Trp Arg Ala Lys Lys Asp Asn His Leu Asn 
290 295 300 

Phe Asn Lys Asn Asp Val Ile Thr Val Leu Glu Gln Gln Asp Met Trp 
305 310 315 320 

Trp Phe Gly Glu Val Gln Gly Gln Lys Gly Trp Phe Pro Lys Ser Tyr 
325 330 335 



Val Lys Leu Ile Ser Gly Pro Ile Arg Lys Ser Thr Ser Met Asp Ser 
340 345 350 

Gly Ser Ser Glu Ser Pro Ala Ser Leu Lys Arg Val Ala Ser Pro Ala 
355 360 365 

Ala Lys Pro Val Val Ser Gly Glu Glu Ile Ala Gln Val Ile Ala Ser 
370 375 380 

Tyr Thr Ala Thr Gly Pro Glu Gln Leu Thr Leu Ala Pro Gly Gln Leu 
385 390 395 400 

Ile Leu Ile Arg Lys Lys Asn Pro Gly Gly Trp Trp Glu Gly Glu Leu 
405 410 415 

Gln Ala Arg Gly Lys Lys Arg Gln Ile Gly Trp Phe Pro Ala Asn Tyr 
420 425 430 

Val Lys Leu Leu Ser Pro Gly Thr Ser Lys Ile Thr Pro Thr Glu Pro 
435 440 445 

Pro Lys Ser Thr Ala Leu Ala Ala Val Cys Gln Val Ile Gly Met Tyr 
450 455 460 

Asp Tyr Thr Ala Gln Asn Asp Asp Glu Leu Ala Phe Asn Lys Gly Gln 
465 470 475 480 

Ile Ile Asn Val Leu Asn Lys Glu Asp Pro Asp Trp Trp Lys Gly Glu 
485 490 495 

Val Asn Gly Gln Val Gly Leu Phe Pro Ser Asn Tyr Val Lys Leu Thr 
500 505 510 

Thr Asp Met Asp Pro Ser Gln Gln 
515 520 

ISOLATED SH3 GENES ASSOCIATED WITH MYELOPROLIFERATIVE 
DISORDERS AND LEUKEMIA, AND USES THEREOF 

The research leading to the present invention was supported in part by the Clinical 
government may have certain rights in the present invention. 

FIELD OF THE INVENTION 

The present invention relates to the isolated nucleic acids and corresponding amino acid 
sequences of a series of SH3 genes, and to analogs, fragments, mutants, and variants thereof. 
The invention provides polypeptides, fusion proteins, chimerics, antisense molecules, 
antibodies, and uses thereof. Also, this invention is directed to diagnostic methods of 
determining whether a subject has a megakaryocyte abnormality, myeloproliferative 
disorder, platelet disorder, hematopoietic disorder, or leukemia, or disorders associated 
with abnormal neural development, and therapeutic treatments thereof. 

BACKGROUND OF THE INVENTION 

Down syndrome, caused by trisomy of human chromosome 21 (HSA21), is the most 
common autosomal form of mental retardation. The first report describing an 
association between Down syndrome (DS) and leukemia, which is an important cause 
of morbidity and mortality worldwide, was presented in 1930. Since that time, the 
increased incidence of acute leukemia in patients with DS has been clearly established. 
However, the M7 subtype, acute megakaryoblastic leukemia (AMKL), has been found 
to be common in DS but relatively rare in non-DS patients. An instability in the control of 
bone marrow proliferation has been hypothesized as a predisposing factor. The 
incidence of acute myelogenous leukemia in patients with DS has been noted by some to 
be similar to that in children without DS. Chromosome 21 is a model for the 
study of human chromosomal aneuploidy, and the construction of its physical and 
transcriptional maps is a necessary step in understanding the molecular basis of 
aneuploidy-dependent phenotypes. 

Human chromosome 21 has a nearly complete physical map with a well-characterized 
contiguous set of overlapping YACs spanning most of its length (Chumakov et al., 
1992; Shimizu et al., 1995; Korenberg et al., 1995). The demand for sequence-ready 
contigs and clones for gene isolation efforts has prompted the construction of numerous 
higher resolution contigs in cosmids (Patil et al., 1994; Soeda et al., 1995) and, more 
recently, in P1-derived artificial chromosomes (PACs; Oegawa et al. 1996 and Hubert 
et al. (1997) Genomics 41:218-226). Considerable mapping efforts exist in the region 
from CBR to D21S55 due to the common duplication of the region in partially trisomic 
individuals with several phenotypic features of DS, including mental retardation. 
However, the distal and adjacent 4- to 5-Mb D21S55 to MX1 region is also associated 
with DS-CHD as well as other characteristic features of DS (Korenberg et al., 1992, 
1994). 

Although full monosomy of chromosome 21 is usually lethal in utero, there are rare 
cases of individuals with chromosome 21 deletions who survive. These individuals 
exhibit a characteristic subset of clinical features including psychomotor and growth 
retardation, congenital heart disease, holoprosencephaly, microphthalmia, skeletal 
malformations, and genital hypoplasia. Megakaryocyte abnormalities are added to this 
set, and a minimal "overlap" region for this feature is defined through the clinical, 
cytogenetic, and molecular analysis of four patients with overlapping deletions of 
chromosome 21 and thrombocytopenia. 

Nonchimeric YACs span this interval with a few gaps, but higher resolution physical 
maps are not available for most of the D21S55 to MX1 region. DEL21RW carries two 
interstitial deletions, one in 21q21.3-22.1 defined by YAC 62G5 through YAC 760H5, 
and the second in 21q22.2, deleting IFNAR through CBR. DEL21LS carries an 
interstitial deletion of 21q22.1 from YAC 760H5 through the AML1 gene. Korenberg 
et al. reported that the deletion of patient DEL21HJ includes D21S93 through AML1. 
DEL21SV has a possible terminal deletion, 21q22.13-qter, extending from just 
proximal to D21S324 through D21S123. The common deleted region, or overlap 
region, is therefore from D21S324 through AML1, a region of less than 2 Mb that 
contains only three known genes, AML1, KCNE1, and UN02. Bone marrow 
examination of two of the patients, DEL21HJ and DEL21RW, showed normocellular 
marrow with normal myelopoiesis, normal erythropoiesis, and small, dysplastic 
megakaryocytes with hypolobated nuclei. These two patients have decreased platelet 
activation by agonists with normal platelet ultrastructures. All four patients have 
platelet dysfunction characterized by low platelet counts in the range of 31-113 x 10^9/L. 
Further, all four subjects with chromosome 21 deletions that do not include this 
region have normal numbers of platelets. 

A 3' fragment of the SH3P17 gene was found in a study to isolate SH3 domain-containing 
genes (Sparks et al. 1996, Nature Biotechnology 14:741). This was mapped to chromosome 21, 
or to a large sub-region of chromosome 21, by a number of groups using database matches to the 
published sequence. Katsanis N, et al. (Hum Genet 1997 Sep; 100(3-4):477-480) utilized 
information generated by various EST sequencing projects to enrich the transcription 
map of chromosome 21 and reported the mapping of SH3P17 to 21q22.1 and the 
localisation of two genes previously mapped to HSA21 by Nagase and colleagues, 
KIAA0136 and KIAA0179, to 21q22.2 and 21q22.3 respectively. Chen H, and 
Antonarakis SE (Cytogenet Cell Genet 1997;78(3-4):213-215) identified portions of 
genes on human chromosome 21 and mapped the gene to YACs and cosmids within 
21q22.1-->q22.2 between DNA markers D21S319 and D21S65 using hybridization 
and PCR amplification. Lastly, Guipponi et al. 1998, Genomics 53:369-376 reported 
that they identified two isoforms of the human homolog of Xenopus Intersectin (ITSN) 
produced from alternate transcripts, the first of which, a short transcript, is reportedly 
ubiquitously expressed, while the second, longer transcript is exclusively expressed in 
brain tissue. Later, Guipponi et al. 1998 Cytogenet Cell Genet. 83:218-220 reported 
that they had identified the genomic structure, sequence and 
precise mapping of the human intersectin gene and speculated that it may play a role 
in the determination of certain of the phenotypic characteristics of Down syndrome. 
The authors did not present evidence and corresponding observations or speculation 
regarding the role of the discovered genes apart from a possible relation to Down 
syndrome, and as such, their reports are distinguishable from the research and discoveries 
embodied in the present invention. 

The present invention provides the complete nucleotide sequence of several SH3 
genes, including the SH3D1A gene and clones thereof, their association with platelet 
dysfunction and leukemia, including a part of the increased risk of leukemia seen in 
Down Syndrome, and with dysfunctions associated with neural development and 
particularly development in the CNS. 

SUMMARY OF THE INVENTION 

In one embodiment, this invention provides isolated nucleic acids which encode human 
SH3 genes such as SH3D1A and cDNA clones thereof, including also analogs, 
fragments, variants, and mutants thereof. This invention is directed to an isolated 
nucleic acid encoding an amino acid sequence which forms one or more myristoylation 
sites in the EH domain and SH3 domain. This invention provides an isolated nucleic acid 
encoding an amino acid sequence which forms one or more EH domains and one or more 
SH3 domains. In one embodiment, the nucleic acid encodes an amino acid 
sequence which forms two EH domains and four SH3 domains. As shown in Figure 1, the 
amino acid sequence encoded by the nucleic acid comprises one or more myristoylation 
sites in the EH domain and SH3 domain. 

In one embodiment of this invention, the isolated nucleic acid encodes an amino acid 
sequence of the EH1 domain which is from amino acid 15 to amino acid 102. In 
another embodiment of this invention, the nucleic acid encodes an amino acid sequence 
of the EH2 domain which is from amino acid 215 to amino acid 310. In another 
embodiment of this invention, the nucleic acid encodes an amino acid sequence of the 
SH3-1 domain which is from amino acid 740 to amino acid 800. In another 
embodiment of this invention, the nucleic acid encodes an amino acid sequence of the 
SH3-2 domain which is from amino acid 908 to amino acid 966. In another 
embodiment of this invention, the nucleic acid encodes an amino acid sequence of the 
SH3-3 domain which is from amino acid 999 to amino acid 1062. In another 
embodiment of this invention, the nucleic acid encodes an amino acid sequence of the 
SH3-4 domain which is from amino acid 1080 to amino acid 1138. In a preferred 
embodiment, the nucleic acid encodes an amino acid sequence as set forth in SEQ. ID. 
NO. 2, and as set forth in Figures 5, 9, 11, 13 and 15. 
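
The residue ranges recited above can be captured in a small lookup table. The following Python sketch is illustrative only and is not part of the disclosure: the names `DOMAINS`, `extract_domain` and `domain_length` are hypothetical, the coordinates are assumed to be 1-based and inclusive (the text does not state the convention), and the input is assumed to be a protein sequence in one-letter code.

```python
# Hypothetical lookup table: residue ranges as quoted in the text,
# assumed here to be 1-based and inclusive.
DOMAINS = {
    "EH1": (15, 102),
    "EH2": (215, 310),
    "SH3-1": (740, 800),
    "SH3-2": (908, 966),
    "SH3-3": (999, 1062),
    "SH3-4": (1080, 1138),
}


def domain_length(name: str) -> int:
    """Number of residues in the named domain (inclusive range)."""
    start, end = DOMAINS[name]
    return end - start + 1


def extract_domain(protein: str, name: str) -> str:
    """Return the residues of one named domain from a one-letter protein string."""
    start, end = DOMAINS[name]
    if end > len(protein):
        raise ValueError(
            f"{name} ends at residue {end}, but the protein has {len(protein)} residues"
        )
    # Convert the 1-based inclusive range to a 0-based half-open Python slice.
    return protein[start - 1:end]
```

Under these assumptions, `domain_length("EH1")` gives the span of the first EH domain, and `extract_domain(seq, "SH3-1")` returns the corresponding sub-sequence of a full-length protein string `seq`.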

This invention provides for an isolated nucleic acid which encodes SH3D1A, and clones 
thereof as set forth herein. The isolated nucleic acid may be DNA or RNA, specifically 
cDNA or genomic DNA. This isolated nucleic acid also encodes mutant SH3D1A or the 
wildtype protein. The isolated nucleic acid may also encode a human SH3D1A having 
substantially the same amino acid sequence as the sequence designated Figure 5. As used 
herein and in the claims, the term nucleic acids encoding or expressing SH3D1A is 
intended to comprehend and include isolated nucleic acids that may have the sequence 
set forth in Figures 4, 8, 10, 12 or 14. 

This invention is directed to a polypeptide comprising the amino acid sequence of a 
human SH3D1A or to a clone thereof. As used herein and in the claims, polypeptide or 
protein of SH3D1A is intended to comprehend and include polypeptides that comprise 
or otherwise correspond to those set forth in Figures 9, 11, 13, or 15 herein, or analogs 
or fragments thereof. Further, polyclonal and monoclonal antibodies which specifically 
bind to the polypeptide are disclosed and chimeric (bi-specific) antibodies are likewise 
contemplated. 

This invention provides a method for determining whether a subject carries a mutation 
in the SH3D1A gene which comprises: (a) obtaining an appropriate nucleic acid sample 
from the subject; and (b) determining whether the nucleic acid sample from step (a) is, 
or is derived from, a nucleic acid which encodes mutant SH3D1A so as to thereby 
determine whether a subject carries a mutation in the SH3D1A gene. 

This invention provides a method for determining whether a subject has a 
megakaryocyte abnormality, myeloproliferative disorder, platelet disorder, or leukemia, 
or a neural disorder which comprises: (a) obtaining an appropriate sample from the 
subject; and (b) contacting the sample with the antibody so as to thereby determine 
whether a subject has the megakaryocyte abnormality, myeloproliferative disorder, 
platelet disorder, leukemia or neural disorder. 

This invention provides a method for determining whether a subject has a predisposition 
for a megakaryocytic abnormality, myeloproliferative disorder, platelet disorder, 
leukemia, or a neural disorder, which comprises: (a) obtaining an appropriate nucleic 
acid sample from the subject; and (b) determining whether the nucleic acid sample from 
step (a) is, or is derived from, a nucleic acid which encodes SH3D1A so as to thereby 
determine whether a subject has a predisposition for a megakaryocytic abnormality, 
myeloproliferative disorder, platelet disorder or leukemia, or a neural disorder. 

This invention provides a method for determining whether a subject has a 
megakaryocytic abnormality, myeloproliferative disorder, platelet disorder, leukemia, or 
a neural disorder, which comprises: (a) obtaining an appropriate nucleic acid sample 
from the subject; and (b) determining whether the nucleic acid sample from step (a) is, 
or is derived from, a nucleic acid which encodes the human SH3D1A so as to thereby 
determine whether a subject has megakaryocytic abnormality, myeloproliferative 
disorder, platelet disorder, leukemia, or a neural disorder. 

This invention provides a method for screening a tumor sample from a human subject for 
a somatic alteration in a SH3D1A gene in said tumor which comprises comparing 
a first sequence selected from the group consisting of a SH3D1A gene from said tumor 
sample, SH3D1A RNA from said tumor sample and SH3D1A cDNA made from mRNA 
from said tumor sample with a second sequence selected from the group consisting of 
SH3D1A gene from a nontumor sample of said subject, SH3D1A RNA from said 
nontumor sample and SH3D1A cDNA made from mRNA from said nontumor sample, 
wherein a difference in the sequence of the SH3D1A gene, SH3D1A RNA or SH3D1A 
cDNA from said tumor sample from the sequence of the SH3D1A gene, SH3D1A RNA 
or SH3D1A cDNA from said nontumor sample indicates a somatic alteration in the 
SH3D1A gene in said tumor sample. 
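
As a rough illustration of the comparison step recited above, the Python sketch below lists the positions at which a tumor-derived sequence differs from the matched nontumor sequence. It assumes, purely for illustration, that the two sequences have already been aligned to equal length; the function name `somatic_differences` is hypothetical and is not part of the disclosure.

```python
from typing import List, Tuple


def somatic_differences(tumor: str, nontumor: str) -> List[Tuple[int, str, str]]:
    """List (1-based position, nontumor base, tumor base) for each mismatch.

    Assumption: both sequences are pre-aligned and of equal length. Real
    samples would need an alignment step first to handle insertions and
    deletions, which this sketch does not attempt.
    """
    if len(tumor) != len(nontumor):
        raise ValueError("sequences must be pre-aligned to the same length")
    return [
        (i + 1, n, t)
        for i, (t, n) in enumerate(zip(tumor.upper(), nontumor.upper()))
        if t != n
    ]
```

An empty result means no difference was found between the two sequences; any entry in the list marks a candidate somatic alteration at that position.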

This invention provides a method for monitoring the progress and adequacy of treatment 
in a subject who has received treatment for a megakaryocytic abnormality, 
myeloproliferative disorder, platelet disorder, leukemia or an abnormal neural condition 
which comprises monitoring the level of nucleic acid encoding the human SH3D1A at 
various stages of treatment. 

The present invention provides the means necessary for production of gene-based 
therapies directed at cancer cells; diagnosis of the predisposition to, and diagnosis and 
treatment of, megakaryocytic abnormality, hematopoietic disorders, myeloproliferative 
disorder, platelet disorder, Down Syndrome, leukemia, and other disorders arising in whole 
or in part from neural abnormalities or dysfunctions; and prenatal diagnosis and 
treatment of tumors. These therapeutic agents may take the form of polynucleotides 
comprising all or a portion of the SH3D1A gene placed in appropriate vectors or 
delivered to target cells in more direct ways such that the function of the SH3D1A 
protein is reconstituted. Therapeutic agents may also take the form of polypeptides 
based on either a portion of, or the entire protein sequence of, SH3D1A. 

This invention provides a pharmaceutical composition comprising an amount of the 
polypeptide of the human SH3D1A as defined herein, and a pharmaceutically effective 
carrier or diluent. 

This invention provides a method of treating a subject having megakaryocytic 
abnormality, myeloproliferative disorder, platelet disorder, leukemia or neural 
abnormality or dysfunction, which comprises introducing the isolated nucleic acid into 
the subject under conditions such that the nucleic acid expresses SH3D1A, so as to 
thereby treat the subject. 

This invention provides a method of treating a subject having megakaryocyte 
abnormality, myeloproliferative disorder, platelet disorder, leukemia, or neural 
abnormality or dysfunction, which comprises administering to the subject a 
therapeutically effective amount of the pharmaceutical composition. 

Lastly, the present invention also provides kits for detecting in an analyte at least one 
oligonucleotide comprising the SH3D1A gene, or a portion thereof, the kits comprising 
a polynucleotide complementary to the SH3D1A gene, a fragment, binding partner, 
analog or other portion thereof, packaged in a suitable container, and instructions 
for its use. 

BRIEF DESCRIPTION OF THE DRAWINGS 

FIGURE 1. Human SH3D1A structure and homology. 

FIGURE 2. SH3D1A domain structure and homologies - human vs. Xenopus. 

FIGURE 3. Region of chromosome 21 responsible for megakaryocytic abnormalities. 

FIGURE 4. Nucleic acid sequence of human SH3D1A. 

FIGURE 5. Amino acid sequence of human SH3D1A. 

FIGURE 6. Northern Blot of SH3D1A expressed in heart, brain, placenta, lung, liver, 
muscle, kidney and pancreas. 

FIGURE 7. Map presenting four cDNA clones in accordance with the invention, 
including length and protein domains. 

FIGURE 8. Nucleic acid sequence of cDNA clone also identified herein as Clone #21. 

FIGURE 9. Amino acid sequence of Clone #21. Upper part of Figure presents 
translated protein sequence; lower portion of Figure presents whole 
protein sequence. 

FIGURE 10. Nucleic acid sequence of cDNA clone also identified herein as Clone #11. 

FIGURE 11. Amino acid sequence of Clone #11. Upper part of Figure presents 
translated protein sequence; lower portion of Figure presents whole 
protein sequence. 

FIGURE 12. Nucleic acid sequence of cDNA clone also identified herein as Clone #5. 

FIGURE 13. Amino acid sequence of Clone #5. Upper part of Figure presents 
translated protein sequence; lower portion of Figure presents whole 
protein sequence. 

FIGURE 14. Nucleic acid sequence of cDNA clone also identified herein as Clone #9. 

FIGURE 15. Amino acid sequence of Clone #9. Upper part of Figure presents 
translated protein sequence; lower portion of Figure presents whole 
protein sequence. 

FIGURE 16. Tissue immunochemical staining on mouse embryo (Day 9) showing 
ITSN expression in neural blasts during migration and formation in CNS. 

FIGURE 17. Summary of Studies on ITSN: 
I. Gene sequence: The first line shows the scale of the ITSN cDNA; the second 
line shows the total number of exons and the position of each 
exon. 
II. Protein domains vs nucleotide sequence: ITSN was predicted 
to consist of 11 protein domains as listed on the map - 2 EH domains, 5 
SH3 domains and 1 each of GEF, PH and C2 domains. Their relative 
positions on the cDNA level are numbered under each domain. 
III. Gene expression in human adult and fetal tissues: This part 
summarizes the Northern blot results showing that ITSN was ubiquitously 
expressed, with extensive alternative splicing generating tissue- and 
developmental stage-specific expression. 

FIGURE 18. Sequence comparisons between nucleic acid molecules of the present 
invention and Intersectins (ITSN), including a consensus sequence. 





DETAILED DESCRIPTION OF THE INVENTION 



The present invention discloses a family of SH3 genes and, in particular, a novel SH3D1A 
gene, clones thereof, and the corresponding proteins, both translated and full length. The 
SH3D1A gene is on chromosome 21 and contributes to the development of platelets 
and the pathogenesis of leukemias, both in general and in particular those involving the 
megakaryocyte lineage. The invention provides methods useful for diagnosing and 
treating the following: acute leukemias, thrombocytopenia, megakaryocyte abnormality, 
hematopoietic disorders, myeloproliferative disorder, platelet disorder, leukemia, 
leukemia in Down syndrome, platelet disorder on chromosome 21, low 
platelets in deletion for chromosome 21, association of gains in chromosome 21 with leukemias, and 
disorders associated with megakaryocyte dysfunction; and neural 
abnormalities, dysfunctions and disorders, including brain malformations and 
corresponding cognitive dysfunctions, microcephaly, lissencephaly, colpocephaly, and 
holoprosencephaly. 

This invention provides an isolated nucleic acid which encodes a human SH3D1A, as 
defined hereinabove, including analogs, such as the nucleic acids set forth in Figures 8, 
10, 12 and 14, fragments, presented herein by way of non-limiting example, variants, and 
mutants thereof. In one embodiment, the nucleic acid has a nucleotide sequence having 
at least 85% similarity with the nucleic acid coding sequence of SEQ ID NO: 1. This 
invention is directed to an isolated nucleic acid encoding an amino acid sequence which 
forms one or more myristoylation sites in the EH domain and SH3 domain. This 
invention provides an isolated nucleic acid encoding an amino acid sequence which forms 
one or more EH domains and one or more SH3 domains. In one embodiment, the nucleic 
acid encodes an amino acid sequence which forms two EH domains and four SH3 
domains. As shown in Figure 1, the amino acid sequence encoded by the nucleic acid 
comprises one or more myristoylation sites in the EH domain and SH3 domain. 
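
One simple way to read the 85% similarity criterion mentioned above is as percent identity over an ungapped alignment. That reading is an assumption made here for illustration; the text does not specify how similarity is to be computed. The function names `percent_identity` and `meets_threshold` in this Python sketch are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of positions at which two equal-length sequences match.

    Assumption: the sequences are already aligned without gaps. A real
    comparison would normally use an alignment tool first.
    """
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def meets_threshold(a: str, b: str, threshold: float = 85.0) -> bool:
    """True if the two sequences reach the given percent identity."""
    return percent_identity(a, b) >= threshold
```

With this definition, a candidate sequence that differs from a reference at one position in ten has 90% identity and would satisfy an 85% threshold, while one that differs at two positions in ten would not.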

In one embodiment of this invention, the isolated nucleic acid encodes an amino acid 
sequence of the EH1 domain which corresponds to the following region: amino acid 
15 to amino acid 102. In another embodiment of this invention, the nucleic acid 
encodes an amino acid sequence of the EH2 domain which is from amino acid 
215 to amino acid 310. In another embodiment of this invention, the nucleic acid encodes 
an amino acid sequence of the SH3-1 domain which is from amino acid 740 to 
amino acid 800. In another embodiment of this invention, the nucleic acid encodes an 
amino acid sequence of the SH3-2 domain which is from amino acid 908 to 
amino acid 966. In another embodiment of this invention, the nucleic acid encodes an 
amino acid sequence of the SH3-3 domain which is from amino acid 999 to 
amino acid 1062. In another embodiment of this invention, the nucleic acid encodes an 
amino acid sequence of the SH3-4 domain which is from amino acid 1080 to 
amino acid 1138. In a preferred embodiment, the nucleic acid encodes an amino acid 
sequence as set forth in Figure 5, or the corresponding analogs set forth in Figures 9, 11, 
13 and 15, presented herein by way of non-limiting example. This invention 
contemplates nucleic acid or amino acid sequences which correspond to the SH3D1A 
gene, and analogs, fragments, variants, and mutants thereof. The corresponding nucleic acids or 
amino acids may be based on the nucleic acid or amino acid sequences disclosed herein, 
or on the structure or function of the EH and SH3 domains which define the 
SH3D1A gene. 

This invention provides for an isolated nucleic acid which encodes SH3D1A. This 
isolated nucleic acid may be DNA or RNA, specifically cDNA or genomic DNA. This 
isolated nucleic acid also encodes mutant SH3D1A or the wildtype protein. The isolated 
nucleic acid may also encode a human SH3D1A having substantially the same amino 
acid sequence as the sequence designated Figure 5. Specifically, the isolated nucleic acid 
has the sequence designated Figure 4. 

This invention provides for a replicable vector comprising the isolated nucleic acid 
molecule of the DNA virus. The vector includes, but is not limited to: a plasmid, cosmid, 
λ phage or yeast artificial chromosome (YAC) which contains at least a portion of the 
isolated nucleic acid molecule. As an example of how to obtain these vectors, insert and vector 
DNA can both be exposed to a restriction enzyme to create complementary ends on both 
molecules which base pair with each other and are then ligated together with DNA ligase. 
Alternatively, linkers can be ligated to the insert DNA which correspond to a restriction 
site in the vector DNA, which is then digested with the restriction enzyme which cuts at 
that site. Other means are also available and known to an ordinary skilled practitioner. 
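
The restriction digest step described above can be pictured with a short Python sketch. It cuts a linear, single-strand text representation of a DNA molecule at each occurrence of a recognition site. EcoRI (site GAATTC, cut after the first G) is used only as a familiar example; the text does not name a particular enzyme, and the function name `digest` and its parameters are hypothetical.

```python
from typing import List


def digest(seq: str, site: str = "GAATTC", cut_offset: int = 1) -> List[str]:
    """Split a linear top-strand sequence at every occurrence of `site`.

    `cut_offset` is the number of bases of the site that stay on the
    left-hand fragment (1 for the illustrative EcoRI example, G^AATTC).
    Only the top strand is modelled; the complementary overhang on the
    bottom strand is implied rather than represented.
    """
    seq = seq.upper()
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + cut_offset])
        start = pos + cut_offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments
```

Digesting both the insert and the vector with the same enzyme leaves fragment ends that begin with the same overhang sequence (AATT in this example), which is what allows them to base pair before ligation.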

Regulatory elements required for expression include promoter or enhancer sequences to
bind RNA polymerase and translation initiation sequences for ribosome binding. For
example, a bacterial expression vector includes a promoter such as the lac promoter and,
for translation initiation, the Shine-Dalgarno sequence and the start codon AUG.
Similarly, a eukaryotic expression vector includes a heterologous or homologous
promoter for RNA polymerase II, a downstream polyadenylation signal, the start codon
AUG, and a termination codon for detachment of the ribosome. Such vectors may be
obtained commercially or assembled from the sequences described by methods well
known in the art, for example the methods described above for constructing vectors in
general.

This invention provides a host cell containing the above vector. The host cell may 
contain the isolated DNA molecule artificially introduced into the host cell. The host cell
may be a bacterial cell (such as E. coli) or a eukaryotic cell, such as a yeast cell, fungal
cell, insect cell or animal cell. Suitable animal cells include, but are not limited to, Vero
cells, HeLa cells, Cos cells, CV1 cells and various primary mammalian cells.

The term "vector" refers to viral expression systems and autonomous self-replicating
circular DNA (plasmids), and includes both expression and nonexpression plasmids.
Where a recombinant microorganism or cell culture is described as hosting an
"expression vector," this includes both extrachromosomal circular DNA and DNA that
has been incorporated into the host chromosome(s). Where a vector is being maintained
by a host cell, the vector may either be stably replicated by the cells during mitosis as an
autonomous structure, or be incorporated within the host's genome.

The term "plasmid" refers to an autonomous circular DNA molecule capable of
replication in a cell, and includes both the expression and nonexpression types. Where
a recombinant microorganism or cell culture is described as hosting an "expression
plasmid", this includes latent viral DNA integrated into the host chromosome(s). Where
a plasmid is being maintained by a host cell, the plasmid is either being stably replicated
by the cells during mitosis as an autonomous structure or is incorporated within the host's
genome.

The following terms are used to describe the sequence relationships between two or more
nucleic acid molecules or polynucleotides: "reference sequence", "comparison window",
"sequence identity", "percentage of sequence identity", and "substantial identity". A
"reference sequence" is a defined sequence used as a basis for a sequence comparison;
a reference sequence may be a subset of a larger sequence, for example, a segment of
a full-length cDNA or gene sequence given in a sequence listing, or may comprise a
complete cDNA or gene sequence.

Optimal alignment of sequences for aligning a comparison window may be conducted
by the local homology algorithm of Smith and Waterman (1981) Adv. Appl. Math. 2:482,
by the homology alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol.
48:443, by the search for similarity method of Pearson and Lipman (1988) Proc. Natl.
Acad. Sci. (USA) 85:2444, or by computerized implementations of these algorithms
(GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package
Release 7.0, Genetics Computer Group, 575 Science Dr., Madison, WI).
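
By way of illustration only, the local homology approach of Smith and Waterman can be sketched as a short dynamic-programming routine. The scoring values (match, mismatch and a linear gap penalty) and the function name are assumptions chosen for this sketch; they are not the defaults of GAP, BESTFIT or any other program named above.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Scoring parameters are illustrative assumptions (linear gap penalty).
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] is the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = H[i - 1][j] + gap      # gap introduced in b
            left = H[i][j - 1] + gap    # gap introduced in a
            # A score never drops below zero, which is what makes the alignment local.
            H[i][j] = max(0, diag, up, left)
            best = max(best, H[i][j])
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGTGG"))
```

The routine returns only the score of the best local alignment; recovering the aligned residues themselves would require a traceback through the matrix, which is omitted here for brevity.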

"Substantial identity" or "substantial sequence identity" mean that two peptide sequences, 
when optimally aligned, such as by the programs GAP or BESTFIT using default gap
weights, share at least 90 percent sequence identity, preferably at least 95 percent sequence
identity, more preferably at least 99 percent sequence identity or more. "Percentage
amino acid identity" or "percentage amino acid sequence identity" refers to a comparison
of the amino acids of two polypeptides which, when optimally aligned, have
approximately the designated percentage of the same amino acids. For example, "95%
amino acid identity" refers to a comparison of the amino acids of two polypeptides which
when optimally aligned have 95% amino acid identity. Preferably, residue positions
which are not identical differ by conservative amino acid substitutions. For example, the
substitution of amino acids having similar chemical properties such as charge or polarity
is not likely to affect the properties of a protein. Examples include glutamine for
asparagine or glutamic acid for aspartic acid.
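
As a non-limiting sketch, percentage amino acid identity can be computed from two sequences that have already been optimally aligned (with "-" marking gaps). The convention of ignoring gapped columns, the function names, and the restriction of the conservative set to the two pairs named above are all assumptions of this sketch; alignment programs differ in how they count gapped positions.

```python
# Only the two conservative pairs named in the text; a fuller table is assumed to exist elsewhere.
CONSERVATIVE_PAIRS = {frozenset("QN"), frozenset("ED")}


def aligned_columns(aligned_a, aligned_b):
    """Yield residue pairs from two pre-aligned sequences, skipping gapped columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]


def percent_identity(aligned_a, aligned_b):
    """Percentage of ungapped aligned positions carrying the same amino acid."""
    columns = aligned_columns(aligned_a, aligned_b)
    if not columns:
        return 0.0
    identical = sum(x == y for x, y in columns)
    return 100.0 * identical / len(columns)


def differences_are_conservative(aligned_a, aligned_b):
    """True if every non-identical ungapped position is a listed conservative pair."""
    return all(
        x == y or frozenset((x, y)) in CONSERVATIVE_PAIRS
        for x, y in aligned_columns(aligned_a, aligned_b)
    )


if __name__ == "__main__":
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
    print(differences_are_conservative("AEQ", "ADN"))
```

Under these assumptions, two ten-residue peptides differing at one position meet the "at least 90 percent" threshold but not the 95 percent threshold.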

The phrase "nucleic acid molecule encoding" refers to a nucleic acid molecule which
directs the expression of a specific protein or peptide. The nucleic acid sequences
include both the DNA strand sequence that is transcribed into RNA and the RNA
sequence that is translated into protein. The nucleic acid molecule includes both the full
length nucleic acid sequences as well as non-full length sequences derived from the full
length sequence. It is further understood that the sequence includes the degenerate
codons of the native sequence or sequences which may be introduced to provide codon
preference in a specific host cell.

This invention provides a nucleic acid having a sequence complementary to the sequence
of the isolated nucleic acid of the human SH3D1A gene. Specifically, this invention
provides an oligonucleotide of at least 15 nucleotides capable of specifically hybridizing
with a sequence of nucleotides present within a nucleic acid which encodes the human
SH3D1A. In one embodiment the nucleic acid is DNA or RNA. In another embodiment
the oligonucleotide is labeled with a detectable marker. In another embodiment the
detectable marker is a radioactive isotope, a fluorophor or an enzyme.

Oligonucleotides which are complementary may be obtained as follows: The polymerase
chain reaction is then carried out using the two primers. See PCR Protocols: A Guide
to Methods and Applications [74]. Following PCR amplification, the PCR-amplified
regions of a viral DNA can be tested for their ability to hybridize to the three specific
nucleic acid probes listed above. Alternatively, hybridization of a viral DNA to the
above nucleic acid probes can be performed by a Southern blot procedure without viral
DNA amplification and under stringent hybridization conditions as described herein.

Oligonucleotides for use as probes or PCR primers are chemically synthesized according
to the solid phase phosphoramidite triester method first described by Beaucage and
Carruthers [19] using an automated synthesizer, as described in Needham-VanDevanter
[69]. Purification of oligonucleotides is by either native acrylamide gel electrophoresis
or by anion-exchange HPLC as described in Pearson, J.D. and Regnier, F.E. [75A]. The
sequence of the synthetic oligonucleotide can be verified using the chemical degradation
method of Maxam, A.M. and Gilbert, W. [63].



High stringency hybridization conditions are selected at about 5 °C lower than the
thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH.
The Tm is the temperature (under defined ionic strength and pH) at which 50% of the
target sequence hybridizes to a perfectly matched probe. Typically, stringent conditions
will be those in which the salt concentration is at least about 0.02 molar at pH 7 and the
temperature is at least about 60 °C. As other factors may significantly affect the
stringency of hybridization, including, among others, base composition and size of the
complementary strands, the presence of organic solvents, the salt or formamide
concentration, and the extent of base mismatching, the combination of parameters is
more important than the absolute measure of any one. For example, high stringency may
be attained by overnight hybridization at about 68 °C in a 6x SSC solution,
washing at room temperature with 6x SSC solution, followed by washing at about 68 °C
in a 0.6x SSC solution.
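
For illustration, the relationship between Tm and a high stringency temperature can be sketched numerically. The sketch below assumes the widely used Wallace rule of thumb (2 °C per A or T, 4 °C per G or C), which is not taken from this specification and is only a rough estimate for short oligonucleotides; it does not account for ionic strength, pH or formamide. The function names and the example oligonucleotide are hypothetical.

```python
def wallace_tm(oligo):
    """Rough Tm estimate in degrees C for a short DNA oligonucleotide (Wallace rule).

    This is an assumed rule of thumb, not a method described in the specification.
    """
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("oligo must contain only A, C, G and T")
    return 2 * at + 4 * gc


def high_stringency_temperature(oligo, offset=5):
    """Temperature about `offset` degrees C below the estimated Tm."""
    return wallace_tm(oligo) - offset


if __name__ == "__main__":
    probe = "ATGCATGCATGCATG"  # hypothetical 15-mer
    print(wallace_tm(probe), high_stringency_temperature(probe))
```

Because the combination of parameters matters more than any single value, such an estimate is at most a starting point for empirical optimization.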

Hybridization with moderate stringency may be attained for example by: 1) filter
pre-hybridizing and hybridizing with a solution of 3x sodium chloride, sodium citrate (SSC),
50% formamide, 0.1 M Tris buffer at pH 7.5, 5x Denhardt's solution; 2) pre-hybridization
at 37 °C for 4 hours; 3) hybridization at 37 °C with an amount of labelled
probe equal to 3,000,000 cpm total for 16 hours; 4) wash in 2x SSC and 0.1% SDS
solution; 5) wash 4x for 1 minute each at room temperature and 4x at 60 °C for 30 minutes
each; and 6) dry and expose to film.

The phrase "selectively hybridizing to" refers to a nucleic acid probe that hybridizes,
duplexes or binds only to a particular target DNA or RNA sequence when the target
sequences are present in a preparation of total cellular DNA or RNA. By selectively
hybridizing it is meant that a probe binds to a given target in a manner that is detectable
in a different manner from non-target sequence under high stringency conditions of
hybridization. "Complementary" or "target" nucleic acid sequences refer
to those nucleic acid sequences which selectively hybridize to a nucleic acid probe.
Proper annealing conditions depend, for example, upon a probe's length, base
composition, and the number of mismatches and their position on the probe, and must
often be determined empirically. For discussions of nucleic acid probe design and
annealing conditions, see, for example, Sambrook et al. [81] or Ausubel, F., et al. [8].

It will be readily understood by those skilled in the art and it is intended here, that when
reference is made to particular sequence listings, such reference includes sequences
which substantially correspond to its complementary sequence and those described
including allowances for minor sequencing errors, single base changes, deletions,
substitutions and the like, including the clonal variants set forth herein, such that any
such sequence variation corresponds to the nucleic acid sequence of the pathogenic
organism or disease marker to which the relevant sequence listing relates.

Nucleic acid probe technology is well known to those skilled in the art who readily
appreciate that such probes may vary greatly in length and may be labeled with a
detectable label, such as a radioisotope or fluorescent dye, to facilitate detection of the
probe. DNA probe molecules may be produced by insertion of a DNA molecule having
the full-length or a fragment of the isolated nucleic acid molecule of the DNA virus into
suitable vectors, such as plasmids or bacteriophages, followed by transforming into
suitable bacterial host cells, replication in the transformed bacterial host cells and
harvesting of the DNA probes, using methods well known in the art. Alternatively,
probes may be generated chemically from DNA synthesizers.

RNA probes may be generated by inserting the full length or a fragment of the isolated
nucleic acid molecule of the DNA virus downstream of a bacteriophage promoter such
as T3, T7 or SP6. Large amounts of RNA probe may be produced by incubating the
labeled nucleotides with a linearized isolated nucleic acid molecule of the DNA virus or
its fragment where it contains an upstream promoter in the presence of the appropriate
RNA polymerase.

As defined herein, nucleic acid probes may be DNA or RNA fragments. DNA fragments
can be prepared, for example, by digesting plasmid DNA, or by use of PCR, or
synthesized by either the phosphoramidite method described by Beaucage and
Carruthers [19], or by the triester method according to Matteucci, et al. [62], both
incorporated herein by reference. A double stranded fragment may then be obtained, if
desired, by annealing the chemically synthesized single strands together under
appropriate conditions or by synthesizing the complementary strand using DNA
polymerase with an appropriate primer sequence. Where a specific sequence for a
nucleic acid probe is given, it is understood that the complementary strand is also
identified and included. The complementary strand will work equally well in situations
where the target is a double-stranded nucleic acid. It is also understood that when a
specific sequence is identified for use as a nucleic acid probe, a subsequence of the listed
sequence which is 25 basepairs or more in length is also encompassed for use as a probe.

The DNA molecules of the subject invention also include DNA molecules coding for
polypeptide analogs, fragments or derivatives of antigenic polypeptides which differ from
naturally-occurring forms in terms of the identity or location of one or more amino acid
residues (deletion analogs containing less than all of the residues specified for the
protein, substitution analogs wherein one or more residues specified are replaced by other
residues and addition analogs wherein one or more amino acid residues is added to a
terminal or medial portion of the polypeptides) and which share some or all properties
of naturally-occurring forms. These molecules include: the incorporation of codons
"preferred" for expression by selected non-mammalian hosts; the provision of sites for
cleavage by restriction endonuclease enzymes; and the provision of additional initial,
terminal or intermediate DNA sequences that facilitate construction of readily expressed
vectors.

Also, this invention provides an antisense molecule capable of specifically hybridizing
with the isolated nucleic acid of the human SH3D1A gene. This invention provides an
antagonist capable of blocking the expression of the peptide or polypeptide encoded by
the isolated DNA molecule. In one embodiment the antagonist is capable of hybridizing
with a double stranded DNA molecule. In another embodiment the antagonist is a triplex
oligonucleotide capable of hybridizing to the DNA molecule. In another embodiment
the triplex oligonucleotide is capable of binding to at least a portion of the isolated DNA
molecule with a nucleotide sequence.

The antisense molecule may be DNA or RNA or variants thereof (i.e. DNA or RNA with
a protein backbone). The present invention extends to the preparation of antisense
nucleotides and ribozymes that may be used to interfere with the expression of the
receptor recognition proteins at the translation of a specific mRNA, either by masking
that mRNA with an antisense nucleic acid or cleaving it with a ribozyme.

Antisense nucleic acids are DNA or RNA molecules that are complementary to at least
a portion of a specific mRNA molecule. In the cell, they hybridize to that mRNA,
forming a double stranded molecule. The cell does not translate an mRNA in this
double-stranded form. Therefore, antisense nucleic acids interfere with the expression
of mRNA into protein.

Antisense nucleotides or polynucleotide sequences are useful in preventing or 
diminishing the expression of the SH3D1A gene, as will be appreciated by those 
skilled in the art. For example, polynucleotide vectors containing all or a portion of
the SH3D1A gene or other sequences from the SH3D1A region (particularly those
flanking the SH3D1A gene) may be placed under the control of a promoter in an
antisense orientation and introduced into a cell. Expression of such an antisense
construct within a cell will interfere with SH3D1A transcription and/or translation
and/or replication. Oligomers of about fifteen nucleotides and molecules that hybridize
to the AUG initiation codon are particularly efficient, since they are easy to synthesize
and are likely to pose fewer problems than larger molecules upon introduction to cells.
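
The design of an antisense oligomer of about fifteen nucleotides spanning the AUG initiation codon can be sketched as follows. This is a minimal illustration only: the mRNA fragment is hypothetical (it is not the SH3D1A sequence), the function names are assumptions, and the window is simply centered on the first AUG found where the sequence allows.

```python
# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(mrna_segment):
    """Return the reverse complement (5'->3') of an RNA segment."""
    return mrna_segment.upper().translate(COMPLEMENT)[::-1]


def antisense_over_start_codon(mrna, length=15):
    """Antisense oligomer of `length` nt covering the first AUG in `mrna`."""
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        raise ValueError("no AUG initiation codon found")
    # Center the window on the AUG, clamped to the ends of the sequence.
    left = max(0, min(start - (length - 3) // 2, len(mrna) - length))
    return antisense_rna(mrna[left:left + length])


if __name__ == "__main__":
    fragment = "GGCACCAUGGCUAGCUUAGGC"  # hypothetical mRNA fragment
    print(antisense_over_start_codon(fragment))
```

The resulting oligomer contains CAU, the reverse complement of AUG, and therefore base pairs with the initiation codon of the target mRNA.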

This invention provides a transgenic nonhuman mammal which comprises at least a 
portion of the isolated DNA molecule introduced into the mammal at an embryonic stage.
Methods of producing a transgenic nonhuman mammal are known to those skilled in the
art.

This invention also provides a method of producing a polypeptide encoded by the isolated
DNA molecule, which comprises growing the above host vector system under suitable
conditions permitting production of the polypeptide and recovering the polypeptide so
produced.

This invention provides a polypeptide comprising the amino acid sequence of a human 
SH3D1A. In one embodiment, the amino acid sequence is set forth in Figure 5. Further,
the isolated polypeptide encoded by the isolated DNA molecule may be linked to a
second polypeptide encoded by a nucleic acid molecule to form a fusion protein by
expression in a suitable host cell. In one embodiment the second nucleic acid molecule
encodes beta-galactosidase. Other nucleic acid molecules which are used to form a
fusion protein are known to those skilled in the art.

This invention provides an antibody which specifically binds to the polypeptide encoded
by the isolated DNA molecule. In one embodiment the antibody is a monoclonal
antibody. In another embodiment the antibody is a polyclonal antibody. The antibody
or DNA molecule may be labelled with a detectable marker including, but not limited to:
a radioactive label, or a colorimetric, a luminescent, or a fluorescent marker, or gold.
Radioactive labels include, but are not limited to: ³H, ¹⁴C, ³²P, ³³P, ³⁵S, ³⁶Cl, ⁵¹Cr, ⁵⁷Co,
⁵⁹Co, ⁵⁹Fe, ⁹⁰Y, ¹²⁵I, ¹³¹I, and ¹⁸⁶Re. Fluorescent markers include but are not limited to:
fluorescein, rhodamine and auramine. Colorimetric markers include, but are not limited
to: biotin, and digoxigenin. Methods of producing the polyclonal or monoclonal
antibody are known to those of ordinary skill in the art.

Further, the antibody or nucleic acid molecule complex may be detected by a second 
antibody which may be linked to an enzyme, such as alkaline phosphatase or horseradish 
peroxidase. Other enzymes which may be employed are well known to one of ordinary 
skill in the art. 

"Specifically binds to an antibody" or "specifically immunoreactive with", when 
referring to a protein or peptide, refers to a binding reaction which is determinative of the
presence of the SH3D1A of the invention in the presence of a heterogeneous population
of proteins and other biologics including viruses other than the SH3D1A. Thus, under
designated immunoassay conditions, the specified antibodies bind to the SH3D1A
antigens and do not bind in a significant amount to other antigens present in the sample.
Specific binding to an antibody under such conditions may require an antibody that is
selected for its specificity for a particular protein. For example, antibodies raised to the
human SH3D1A immunogen described herein can be selected to obtain antibodies
specifically immunoreactive with the SH3D1A proteins and not with other proteins.
These antibodies recognize proteins homologous to the human SH3D1A protein. A
variety of immunoassay formats may be used to select antibodies specifically
immunoreactive with a particular protein. For example, solid-phase ELISA
immunoassays are routinely used to select monoclonal antibodies specifically
immunoreactive with a protein. See Harlow and Lane [32] for a description of
immunoassay formats and conditions that can be used to determine specific
immunoreactivity.

This invention provides a method to select specific regions on the polypeptide encoded 
by the isolated DNA molecule of the DNA virus to generate antibodies. The protein 
sequence may be determined from the cDNA sequence. Amino acid sequences may be 
analyzed by methods well known to those skilled in the art to determine whether they 
produce hydrophobic or hydrophilic regions in the proteins which they build. In the case 
of cell membrane proteins, hydrophobic regions are well known to form the part of the 
protein that is inserted into the lipid bilayer of the cell membrane, while hydrophilic 
regions are located on the cell surface, in an aqueous environment. Usually, the 
hydrophilic regions will be more immunogenic than the hydrophobic regions. Therefore,
the hydrophilic amino acid sequences may be selected and used to generate antibodies
specific to the polypeptide encoded by the isolated nucleic acid molecule encoding the DNA
virus. The selected peptides may be prepared using commercially available machines. 
As an alternative, DNA, such as a cDNA or a fragment thereof, may be cloned and 
expressed and the resulting polypeptide recovered and used as an immunogen. 
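
One way such an analysis might be carried out is a sliding-window hydropathy scan. The sketch below assumes the Kyte-Doolittle hydropathy scale and a seven-residue window, neither of which is specified in this document; the function names and the example peptide are hypothetical. More negative averages indicate more hydrophilic stretches, which are candidates for use as immunogens.

```python
# Kyte-Doolittle hydropathy values (assumed scale; positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=7):
    """Average hydropathy of each window of `window` residues along the protein."""
    scores = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def most_hydrophilic_window(protein, window=7):
    """Return (start index, peptide) of the window with the lowest average hydropathy."""
    protein = protein.upper()
    profile = hydropathy_profile(protein, window)
    if not profile:
        raise ValueError("protein is shorter than the window")
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, protein[start:start + window]


if __name__ == "__main__":
    example = "LLLLLLLKDEKRDELLLLLLL"  # hypothetical peptide
    print(most_hydrophilic_window(example))
```

Hydrophilicity alone does not guarantee immunogenicity, so any peptide selected this way would still need to be tested experimentally.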

Polyclonal antibodies against these peptides may be produced by immunizing animals
using the selected peptides. Monoclonal antibodies are prepared using hybridoma
technology by fusing antibody producing B cells from immunized animals with myeloma
cells and selecting the resulting hybridoma cell line producing the desired antibody.
Alternatively, monoclonal antibodies may be produced by in vitro techniques known to
a person of ordinary skill in the art. Also as set forth earlier herein, chimeric (bi-specific)
antibodies may be prepared by techniques well known in the art, and are likewise
contemplated herein. Any and all of these antibodies are useful to detect the expression
of the polypeptide encoded by the isolated DNA molecule of the DNA virus in living
animals, in humans, or in biological tissues or fluids isolated from animals or humans.

The antibodies may be detectably labeled, utilizing conventional labeling techniques
well-known to the art. Thus, the antibodies may be radiolabeled using, for example,
radioactive isotopes such as ³H, ¹²⁵I, ¹³¹I, and ³⁵S. The antibodies may also be labeled
using fluorescent labels, enzyme labels, free radical labels, or bacteriophage labels, using
techniques known in the art. Typical fluorescent labels include fluorescein
isothiocyanate, rhodamine, phycoerythrin, phycocyanin, allophycocyanin, and Texas Red.

Since specific enzymes may be coupled to other molecules by covalent links, the
possibility also exists that they might be used as labels for the production of tracer
materials. Suitable enzymes include alkaline phosphatase, beta-galactosidase,
glucose-6-phosphate dehydrogenase, malate dehydrogenase, and peroxidase. Two
principal types of enzyme immunoassay are the enzyme-linked immunosorbent assay
(ELISA), and the homogeneous enzyme immunoassay, also known as enzyme-multiplied
immunoassay (EMIT, Syva Corporation, Palo Alto, CA). In the ELISA system,
separation may be achieved, for example, by the use of antibodies coupled to a solid
phase. The EMIT system depends on deactivation of the enzyme in the tracer-antibody
complex; the activity can thus be measured without the need for a separation step.

Additionally, chemiluminescent compounds may be used as labels. Typical
chemiluminescent compounds include luminol, isoluminol, aromatic acridinium esters,
imidazoles, acridinium salts, and oxalate esters. Similarly, bioluminescent compounds
may be utilized for labelling, the bioluminescent compounds including luciferin,
luciferase, aequorin, and fluorescent proteins such as green fluorescent protein (GFP).
Once labeled, the antibody may be employed to identify and quantify immunologic
counterparts (antibody or antigenic polypeptide) utilizing techniques well-known to the
art.

A description of a radioimmunoassay (RIA) may be found in Laboratory Techniques in
Biochemistry and Molecular Biology [52], with particular reference to the chapter entitled
"An Introduction to Radioimmune Assay and Related Techniques" by Chard, T.,
incorporated by reference herein. A description of general immunometric assays of
various types can be found in the following U.S. Pat. Nos. 4,376,110 (David et al.) or
4,098,876 (Piasio).

One can use immunoassays to detect the SH3D1A gene, specific peptides, or
antibodies to the virus or peptides. A general overview of the applicable technology is
in Harlow and Lane [32], incorporated by reference herein.

In one embodiment, antibodies to the human SH3D1A can be used to detect the agent in
the sample. In brief, to produce antibodies to the agent or peptides, the sequence being
targeted is expressed in transfected cells, preferably bacterial cells, and purified. The
product is injected into a mammal capable of producing antibodies. Either monoclonal
or polyclonal antibodies (as well as any recombinant antibodies) specific for the gene
product can be used in various immunoassays. Such assays include competitive
immunoassays, radioimmunoassays, Western blots, ELISA, indirect immunofluorescent
assays and the like. For competitive immunoassays, see Harlow and Lane [32] at pages
567-573 and 584-589.

In a further embodiment of this invention, commercial test kits suitable for use by a
medical specialist may be prepared to determine the presence or absence of
predetermined binding activity or predetermined binding activity capability to suspected
target cells. In accordance with the testing techniques discussed above, one class of such
kits will contain at least the labeled polypeptide or its binding partner, for instance an
antibody specific thereto, and directions, of course, depending upon the method selected,
e.g., "competitive," "sandwich," "DASP" and the like. The kits may also contain
peripheral reagents such as buffers, stabilizers, etc.

Monoclonal antibodies or recombinant antibodies may be obtained by various techniques
familiar to those skilled in the art. Briefly, spleen cells or other lymphocytes from an
animal immunized with a desired antigen are immortalized, commonly by fusion with
a myeloma cell (see, Kohler and Milstein [50], incorporated herein by reference).
Alternative methods of immortalization include transformation with Epstein Barr Virus,
oncogenes, or retroviruses, or other methods well known in the art. Colonies arising
from single immortalized cells are screened for production of antibodies of the desired
specificity and affinity for the antigen, and yield of the monoclonal antibodies produced
by such cells may be enhanced by various techniques, including injection into the
peritoneal cavity of a vertebrate host. New techniques using recombinant phage antibody
expression systems can also be used to generate monoclonal antibodies. See for
example: McCafferty, J. et al. [64]; Hoogenboom, H.R. et al. [39]; and Marks, J.D. et al.
[60].

Such peptides may be produced by expressing the specific sequence in a recombinantly
engineered cell such as bacteria, yeast, filamentous fungal, insect (especially employing
baculoviral vectors), and mammalian cells. Those of skill in the art are knowledgeable
in the numerous expression systems available for expression of herpes virus protein.

Briefly, the expression of natural or synthetic nucleic acids encoding viral protein will
typically be achieved by operably linking the desired sequence or portion thereof to a
promoter (which is either constitutive or inducible), and incorporating it into an expression
vector. The vectors are suitable for replication or integration in either prokaryotes or
eukaryotes. Typical cloning vectors contain antibiotic resistance markers, genes for
selection of transformants, inducible or regulatable promoter regions, and translation
terminators that are useful for the expression of viral genes.

Methods for the expression of cloned genes in bacteria are also well known. In general,
to obtain high level expression of a cloned gene in a prokaryotic system, it is advisable
to construct expression vectors containing a strong promoter to direct mRNA
transcription. The inclusion of selection markers in DNA vectors transformed in E. coli
is also useful. Examples of such markers include genes specifying resistance to
antibiotics. See [81] supra, for details concerning selection markers and promoters for
use in E. coli. Suitable eukaryote hosts may include plant cells, insect cells, mammalian
cells, yeast, and filamentous fungi.

The peptides derived from the nucleic acids, and peptide fragments produced by
recombinant technology, may be purified by standard techniques well known to those of
skill in the art. Recombinantly produced sequences can be directly expressed or
expressed as a fusion protein. The protein is then purified by a combination of cell lysis
(e.g., sonication) and affinity chromatography. For fusion products, subsequent digestion
of the fusion protein with an appropriate proteolytic enzyme releases the desired peptide.

The proteins may be purified to substantial purity by standard techniques well known in
the art, including selective precipitation with such substances as ammonium sulfate, 
column chromatography, immunopurification methods, and others. See, for instance, 
Scopes, R. [84], incorporated herein by reference. 

This invention is directed to analogs of the isolated nucleic acid and polypeptide which
comprise the amino acid sequence as set forth above. The analog may have an
N-terminal methionine or a polyhistidine optionally attached to the N or
COOH terminus of the polypeptide which comprises the amino acid sequence.

In another embodiment, this invention contemplates peptide fragments of the polypeptide
which result from proteolytic digestion products of the polypeptide. In another
embodiment, the derivative of the polypeptide has one or more chemical moieties
attached thereto. In another embodiment the chemical moiety is a water soluble polymer.
In another embodiment the chemical moiety is polyethylene glycol. In another
embodiment the chemical moiety is mono-, di-, tri- or tetrapegylated. In another
embodiment the chemical moiety is N-terminal monopegylated.

Attachment of polyethylene glycol (PEG) to compounds is particularly useful because
PEG has very low toxicity in mammals (Carpenter et al., 1971). For example, a PEG
adduct of adenosine deaminase was approved in the United States for use in humans for
the treatment of severe combined immunodeficiency syndrome. A second advantage
afforded by the conjugation of PEG is that of effectively reducing the immunogenicity
and antigenicity of heterologous compounds. For example, a PEG adduct of a human
protein might be useful for the treatment of disease in other mammalian species without
the risk of triggering a severe immune response. The compound of the present invention
may be delivered in a microencapsulation device so as to reduce or prevent a host
immune response against the compound or against cells which may produce the
compound. The compound of the present invention may also be delivered
microencapsulated in a membrane, such as a liposome.

Numerous activated forms of PEG suitable for direct reaction with proteins have been
described. Useful PEG reagents for reaction with protein amino groups include active
esters of carboxylic acid or carbonate derivatives, particularly those in which the leaving
groups are N-hydroxysuccinimide, p-nitrophenol, imidazole or
1-hydroxy-2-nitrobenzene-4-sulfonate. PEG derivatives containing maleimido or haloacetyl groups
are useful reagents for the modification of protein free sulfhydryl groups. Likewise, PEG
reagents containing amino, hydrazine or hydrazide groups are useful for reaction with
aldehydes generated by periodate oxidation of carbohydrate groups in proteins.

In one embodiment, the amino acid residues of the polypeptide described herein are
preferred to be in the "L" isomeric form. In another embodiment, the residues in the "D"
isomeric form can be substituted for any L-amino acid residue, as long as the desired
functional property of lectin activity is retained by the polypeptide. NH2 refers to the free
amino group present at the amino terminus of a polypeptide. COOH refers to the free
carboxy group present at the carboxy terminus of a polypeptide. Abbreviations used
herein are in keeping with standard polypeptide nomenclature, J. Biol. Chem.,
243:3552-59 (1969).

It should be noted that all amino-acid residue sequences are represented herein by
formulae whose left and right orientation is in the conventional direction of
amino-terminus to carboxy-terminus. Furthermore, it should be noted that a dash at the
beginning or end of an amino-acid residue sequence indicates a peptide bond to a further
sequence of one or more amino acid residues.

Synthetic polypeptides, prepared using the well known techniques of solid phase, liquid
phase, or peptide condensation, or any combination thereof, can include
natural and unnatural amino acids. Amino acids used for peptide synthesis may be
standard Boc (Nα-amino protected Nα-t-butyloxycarbonyl) amino acid resin with the
standard deprotecting, neutralization, coupling and wash protocols of the original solid
phase procedure of Merrifield (1963, J. Am. Chem. Soc. 85:2149-2154), or the
base-labile Nα-amino protected 9-fluorenylmethoxycarbonyl (Fmoc) amino acids first
described by Carpino and Han (1972, J. Org. Chem. 37:3403-3409). Thus, the polypeptide
of the invention may comprise D-amino acids, a combination of D- and L-amino acids,
and various "designer" amino acids (e.g., β-methyl amino acids, Cα-methyl amino acids,
and Nα-methyl amino acids, etc.) to convey special properties. Synthetic amino acids
include ornithine for lysine, fluorophenylalanine for phenylalanine, and norleucine for
leucine or isoleucine. Additionally, by assigning specific amino acids at specific
coupling steps, α-helices, β turns, β sheets, γ-turns, and cyclic peptides can be generated.

In one aspect of the invention, the peptides may comprise a special amino acid at the
C-terminus which incorporates either a CO2H or CONH2 side chain to simulate a free
glycine or a glycine-amide group. Another way to consider this special residue would
be as a D or L amino acid analog with a side chain consisting of the linker or bond to the
bead. In one embodiment, the pseudo-free C-terminal residue may be of the D or the L
optical configuration; in another embodiment, a racemic mixture of D and L-isomers may
be used.

In an additional embodiment, pyroglutamate may be included as the N-terminal residue
of the peptide. Although pyroglutamate is not amenable to sequencing by Edman
degradation, by limiting substitution to only 50% of the peptides on a given bead with
N-terminal pyroglutamate, there will remain enough non-pyroglutamate peptide on the
bead for sequencing. One of ordinary skill would readily recognize that this technique
could be used for sequencing of any peptide that incorporates a residue resistant to
Edman degradation at the N-terminus. Other methods to characterize individual peptides
that demonstrate desired activity are described in detail infra. Specific activity of a
peptide that comprises a blocked N-terminal group, e.g., pyroglutamate, when the
particular N-terminal group is present in 50% of the peptides, would readily be
demonstrated by comparing activity of a completely (100%) blocked peptide with a
non-blocked (0%) peptide.

In addition, the present invention envisions preparing peptides that have more well
defined structural properties, and the use of peptidomimetics, and peptidomimetic bonds,
such as ester bonds, to prepare peptides with novel properties. In another embodiment,
a peptide may be generated that incorporates a reduced peptide bond, i.e.,
R1-CH2-NH-R2, where R1 and R2 are amino acid residues or sequences. A reduced peptide bond may
be introduced as a dipeptide subunit. Such a molecule would be resistant to peptide bond
hydrolysis, e.g., protease activity. Such peptides would provide ligands with unique
function and activity, such as extended half-lives in vivo due to resistance to metabolic
breakdown, or protease activity. Furthermore, it is well known that in certain systems
constrained peptides show enhanced functional activity (Hruby, 1982, Life Sciences
31:189-199; Hruby et al., 1990, Biochem. J. 268:249-262); the present invention provides
a method to produce a constrained peptide that incorporates random sequences at all
other positions.

A constrained, cyclic or rigidized peptide may be prepared synthetically, provided that
in at least two positions in the sequence of the peptide an amino acid or amino acid
analog is inserted that provides a chemical functional group capable of cross-linking to
constrain, cyclise or rigidize the peptide after treatment to form the cross-link.
Cyclization will be favored when a turn-inducing amino acid is incorporated. Examples
of amino acids capable of cross-linking a peptide are cysteine to form disulfide, aspartic
acid to form a lactone or a lactam, and a chelator such as γ-carboxyl-glutamic acid (Gla)
(Bachem) to chelate a transition metal and form a cross-link. Protected γ-carboxyl
glutamic acid may be prepared by modifying the synthesis described by Zee-Cheng and
Olson (1980, Biophys. Biochem. Res. Commun. 94:1128-1132). A peptide in which the
peptide sequence comprises at least two amino acids capable of cross-linking may be
treated, e.g., by oxidation of cysteine residues to form a disulfide or addition of a metal
ion to form a chelate, so as to cross-link the peptide and form a constrained, cyclic or
rigidized peptide.

The present invention provides strategies to systematically prepare cross-links. For
example, if four cysteine residues are incorporated in the peptide sequence, different
protecting groups may be used (Hiskey, 1981, in The Peptides: Analysis, Synthesis,
Biology, Vol. 3, Gross and Meienhofer, eds., Academic Press: New York, pp. 137-167;
Ponsanti et al., 1990, Tetrahedron 46:8255-8266). The first pair of cysteines may be
deprotected and oxidized, then the second pair may be deprotected and oxidized. In this
way a defined set of disulfide cross-links may be formed. Alternatively, a pair of
cysteines and a pair of chelating amino acid analogs may be incorporated so that the
cross-links are of a different chemical nature.

The following non-classical amino acids may be incorporated in the peptide in order to
introduce particular conformational motifs: 1,2,3,4-tetrahydroisoquinoline-3-carboxylate
(Kazmierski et al., 1991, J. Am. Chem. Soc. 113:2275-2283);
(2S,3S)-methyl-phenylalanine, (2S,3R)-methyl-phenylalanine, (2R,3S)-methyl-phenylalanine and
(2R,3R)-methyl-phenylalanine (Kazmierski and Hruby, 1991, Tetrahedron Lett.);
2-aminotetrahydronaphthalene-2-carboxylic acid (Landis, 1989, Ph.D. Thesis, University
of Arizona); hydroxy-1,2,3,4-tetrahydroisoquinoline-3-carboxylate (Miyake et al., 1989,
J. Takeda Res. Labs. 43:53-76); β-carboline (D and L) (Kazmierski, 1988, Ph.D. Thesis,
University of Arizona); HIC (histidine isoquinoline carboxylic acid) (Zechel et al., 1991,
Int. J. Pep. Protein Res. 43); and HIC (histidine cyclic urea) (Dharanipragada).

The following amino acid analogs and peptidomimetics may be incorporated into a
peptide to induce or favor specific secondary structures: LL-Acp
(LL-3-amino-2-propenidone-6-carboxylic acid), a β-turn inducing dipeptide analog (Kemp et al., 1985,
J. Org. Chem. 50:5834-5838); β-sheet inducing analogs (Kemp et al., 1988, Tetrahedron
Lett. 29:5081-5082); β-turn inducing analogs (Kemp et al., 1988, Tetrahedron Lett.
29:5057-5060); α-helix inducing analogs (Kemp et al., 1988, Tetrahedron Lett.
29:4935-4938); γ-turn inducing analogs (Kemp et al., 1989, J. Org. Chem. 54:109-115); and
analogs provided by the following references: Nagai and Sato, 1985, Tetrahedron Lett.
26:647-650; DiMaio et al., 1989, J. Chem. Soc. Perkin Trans., p. 1687; also a Gly-Ala
turn analog (Kahn et al., 1989, Tetrahedron Lett. 30:2317); amide bond isostere (Jones
et al., 1988, Tetrahedron Lett. 29:3853-3856); tetrazole (Zabrocki et al., 1988, J. Am.
Chem. Soc. 110:5875-5880); DTC (Samanen et al., 1990, Int. J. Protein Pep. Res.
35:501-509); and analogs taught in Olson et al., 1990, J. Am. Chem. Soc. 112:323-333
and Garvey et al., 1990, J. Org. Chem. 56:436. Conformationally restricted mimetics of
beta turns and beta bulges, and peptides containing them, are described in U.S. Patent No.
5,440,013, issued August 8, 1995 to Kahn.

The present invention further provides for modification or derivatization of the
polypeptide or peptide of the invention. Modifications of peptides are well known to one
of ordinary skill, and include phosphorylation, carboxymethylation, and acylation.
Modifications may be effected by chemical or enzymatic means. In another aspect,
glycosylated or fatty acylated peptide derivatives may be prepared. Preparation of
glycosylated or fatty acylated peptides is well known in the art. Fatty acyl peptide
derivatives may also be prepared. For example, and not by way of limitation, a free
amino group (N-terminal or lysyl) may be acylated, e.g., myristoylated. In another
embodiment an amino acid comprising an aliphatic side chain of the structure
-(CH2)nCH3 may be incorporated in the peptide. This and other peptide-fatty acid
conjugates suitable for use in the present invention are disclosed in U.K. Patent
GB-8809162.4, International Patent Application PCT/AU89/00166, and reference 5, supra.

Mutations can be made in a nucleic acid encoding the polypeptide such that a particular
codon is changed to a codon which codes for a different amino acid. Such a mutation is
generally made by making the fewest nucleotide changes possible. A substitution
mutation of this sort can be made to change an amino acid in the resulting protein in a
non-conservative manner (i.e., by changing the codon from an amino acid belonging to
a grouping of amino acids having a particular size or characteristic to an amino acid
belonging to another grouping) or in a conservative manner (i.e., by changing the codon
from an amino acid belonging to a grouping of amino acids having a particular size or
characteristic to an amino acid belonging to the same grouping). Such a conservative
change generally leads to less change in the structure and function of the resulting
protein. A non-conservative change is more likely to alter the structure, activity or
function of the resulting protein. The present invention should be considered to include
sequences containing conservative changes which do not significantly alter the activity
or binding characteristics of the resulting protein. Substitutes for an amino acid within
the sequence may be selected from other members of the class to which the amino acid
belongs. For example, the nonpolar (hydrophobic) amino acids include alanine, leucine,
isoleucine, valine, proline, phenylalanine, tryptophan and methionine. Amino acids
containing aromatic ring structures are phenylalanine, tryptophan, and tyrosine. The
polar neutral amino acids include glycine, serine, threonine, cysteine, tyrosine,
asparagine, and glutamine. The positively charged (basic) amino acids include arginine,
lysine and histidine. The negatively charged (acidic) amino acids include aspartic acid
and glutamic acid. Such alterations will not be expected to affect apparent molecular
weight as determined by polyacrylamide gel electrophoresis, or isoelectric point.
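
As an illustration of making a substitution with the fewest nucleotide changes, the
following Python sketch lists, for a given codon and a desired amino acid, the codons of
the standard genetic code that differ from the starting codon at the fewest positions.
The function name closest_codons and the example codons are assumptions made for
illustration only and are not part of the disclosure.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def closest_codons(codon, target_aa):
    """Return (changes, codons): the minimum number of nucleotide changes
    needed to turn `codon` into a codon for `target_aa` (one-letter code),
    and every codon that achieves that minimum."""
    codon = codon.upper()
    if codon not in CODON_TABLE:
        raise ValueError(f"not a DNA codon: {codon!r}")
    candidates = [c for c, aa in CODON_TABLE.items() if aa == target_aa]
    if not candidates:
        raise ValueError(f"unknown amino acid: {target_aa!r}")

    def changes(other):
        # Number of positions at which the two codons differ.
        return sum(a != b for a, b in zip(codon, other))

    best = min(changes(c) for c in candidates)
    return best, sorted(c for c in candidates if changes(c) == best)


if __name__ == "__main__":
    # Hypothetical example: change a lysine codon (AAA) to arginine.
    print(closest_codons("AAA", "R"))
```

For instance, a Lys codon AAA can be converted to an Arg codon by a single change (AGA),
whereas the other arginine codons would require two or three changes.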

Particularly preferred substitutions are: 

- Lys for Arg and vice versa such that a positive charge may be maintained; 

- Glu for Asp and vice versa such that a negative charge may be maintained; 

- Ser for Thr such that a free -OH can be maintained; and

- Gln for Asn such that a free NH2 can be maintained.
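
The groupings described above can be expressed as a small lookup, sketched here in
Python. The sketch treats a substitution as conservative when both residues share at
least one of the listed classes; this rule, the one-letter codes, and the function names
are illustrative assumptions rather than a definition given in the specification.

```python
# Amino acid classes as listed in the text, using one-letter codes.
CLASSES = {
    "nonpolar": set("ALIVPFWM"),       # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "aromatic": set("FWY"),            # Phe, Trp, Tyr
    "polar_neutral": set("GSTCYNQ"),   # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),               # Arg, Lys, His
    "acidic": set("DE"),               # Asp, Glu
}


def shared_classes(original, substitute):
    """Names of the classes that contain both residues, in sorted order."""
    original, substitute = original.upper(), substitute.upper()
    return sorted(
        name
        for name, members in CLASSES.items()
        if original in members and substitute in members
    )


def is_conservative(original, substitute):
    """True when the two residues share at least one class."""
    return bool(shared_classes(original, substitute))


if __name__ == "__main__":
    # The particularly preferred substitutions all stay within one class.
    for pair in ("KR", "ED", "ST", "QN"):
        print(pair, is_conservative(pair[0], pair[1]))
```

Under this sketch each of the particularly preferred substitutions (Lys/Arg, Glu/Asp,
Ser/Thr, Gln/Asn) is classed as conservative, while a change such as Lys to Asp is not.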

Synthetic DNA sequences allow convenient construction of genes which will express
analogs or "muteins". A general method for site-specific incorporation of unnatural
amino acids into proteins is described in Noren et al., Science, 244:182-188 (April 1989).
This method may be used to create analogs with unnatural amino acids.

In accordance with the present invention there may be employed conventional molecular
biology, microbiology, and recombinant DNA techniques within the skill of the art. Such
techniques are explained fully in the literature. See, e.g., Sambrook et al., "Molecular
Cloning: A Laboratory Manual" (1989); "Current Protocols in Molecular Biology"
Volumes I-III [Ausubel, R. M., ed. (1994)]; "Cell Biology: A Laboratory Handbook"
Volumes I-III [J. E. Celis, ed. (1994)]; "Current Protocols in Immunology" Volumes
I-III [Coligan, J. E., ed. (1994)]; "Oligonucleotide Synthesis" (M.J. Gait ed. 1984);
"Nucleic Acid Hybridization" [B.D. Hames & S.J. Higgins eds. (1985)]; "Transcription
And Translation" [B.D. Hames & S.J. Higgins, eds. (1984)]; "Animal Cell Culture" [R.I.
Freshney, ed. (1986)]; "Immobilized Cells And Enzymes" [IRL Press, (1986)]; B. Perbal,
"A Practical Guide To Molecular Cloning" (1984).


Chemical Moieties For Derivatization. Chemical moieties suitable for derivatization
may be selected from among water soluble polymers. The polymer selected should be
water soluble so that the component to which it is attached does not precipitate in an
aqueous environment, such as a physiological environment. Preferably, for therapeutic
use of the end-product preparation, the polymer will be pharmaceutically acceptable.
One skilled in the art will be able to select the desired polymer based on such
considerations as whether the polymer/component conjugate will be used therapeutically,
and if so, the desired dosage, circulation time, resistance to proteolysis, and other
considerations. For the present component or components, these may be ascertained
using the assays provided herein.

The water soluble polymer may be selected from the group consisting of, for example,
polyethylene glycol, copolymers of ethylene glycol/propylene glycol,
carboxymethylcellulose, dextran, polyvinyl alcohol, polyvinyl pyrrolidone,
poly-1,3-dioxolane, poly-1,3,6-trioxane, ethylene/maleic anhydride copolymer, polyaminoacids
(either homopolymers or random copolymers), and dextran or poly(n-vinyl
pyrrolidone)polyethylene glycol, propylene glycol homopolymers, polypropylene
oxide/ethylene oxide copolymers, polyoxyethylated polyols and polyvinyl alcohol.
Polyethylene glycol propionaldehyde may have advantages in manufacturing due to its
stability in water.

The number of polymer molecules so attached may vary, and one skilled in the art will
be able to ascertain the effect on function. One may mono-derivatize, or may provide for
a di-, tri-, tetra- or some combination of derivatization, with the same or different
chemical moieties (e.g., polymers, such as different weights of polyethylene glycols).
The proportion of polymer molecules to component or components molecules will vary,
as will their concentrations in the reaction mixture. In general, the optimum ratio (in
terms of efficiency of reaction in that there is no excess unreacted component or
components and polymer) will be determined by factors such as the desired degree of
derivatization (e.g., mono-, di-, tri-, etc.), the molecular weight of the polymer selected,
whether the polymer is branched or unbranched, and the reaction conditions.
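
The bookkeeping implied by the degree of derivatization can be sketched in Python as
follows. The sketch simply adds the nominal mass of each attached polymer chain to the
mass of the component and names the degree of derivatization; it ignores any mass change
due to leaving groups or linkers, and all names and example masses are hypothetical.

```python
DEGREE_NAMES = {1: "mono", 2: "di", 3: "tri", 4: "tetra"}


def conjugate_mass_kda(protein_kda, peg_chain_masses_kda):
    """Nominal mass of a conjugate: protein mass plus the mass of each chain.

    `peg_chain_masses_kda` holds one entry per attached chain, so chains of
    different weights can be mixed.
    """
    if protein_kda <= 0 or any(m <= 0 for m in peg_chain_masses_kda):
        raise ValueError("masses must be positive")
    return protein_kda + sum(peg_chain_masses_kda)


def degree_of_derivatization(peg_chain_masses_kda):
    """Name the degree of derivatization from the number of attached chains."""
    n = len(peg_chain_masses_kda)
    if n == 0:
        return "unmodified"
    prefix = DEGREE_NAMES.get(n, f"{n}-fold ")
    return prefix + "pegylated"


if __name__ == "__main__":
    # Hypothetical 20 kDa component carrying one 5 kDa and one 20 kDa chain.
    chains = [5.0, 20.0]
    print(degree_of_derivatization(chains), conjugate_mass_kda(20.0, chains))
```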

The polyethylene glycol molecules (or other chemical moieties) should be attached to the
component or components with consideration of effects on functional or antigenic
domains of the protein. There are a number of attachment methods available to those
skilled in the art, e.g., EP 0 401 384 herein incorporated by reference (coupling PEG to
G-CSF), see also Malik et al., 1992, Exp. Hematol. 20:1028-1035 (reporting pegylation
of GM-CSF using tresyl chloride). For example, polyethylene glycol may be covalently
bound through amino acid residues via a reactive group, such as a free amino or carboxyl
group. Reactive groups are those to which an activated polyethylene glycol molecule
may be bound. The amino acid residues having a free amino group include lysine
residues and the N-terminal amino acid residue; those having a free carboxyl group
include aspartic acid residues, glutamic acid residues and the C-terminal amino acid
residue. Sulfhydryl groups may also be used as a reactive group for attaching the
polyethylene glycol molecule(s). Preferred for therapeutic purposes is attachment at an
amino group, such as attachment at the N-terminus or lysine group.

This invention provides a method for determining whether a subject carries a mutation
in the SH3D1A gene which comprises: (a) obtaining an appropriate nucleic acid sample
from the subject; and (b) determining whether the nucleic acid sample from step (a) is, or
is derived from, a nucleic acid which encodes mutant SH3D1A so as to thereby
determine whether a subject carries a mutation in the SH3D1A gene. In one embodiment,
the nucleic acid sample in step (a) comprises mRNA corresponding to the transcript of
DNA encoding a mutant SH3D1A, and wherein the determining of step (b) comprises:
(i) contacting the mRNA with the oligonucleotide under conditions permitting binding
of the mRNA to the oligonucleotide so as to form a complex; (ii) isolating the complex
so formed; and (iii) identifying the mRNA in the isolated complex so as to thereby
determine whether the mRNA is, or is derived from, a nucleic acid which encodes mutant
SH3D1A. In another embodiment, the determining of step (b) comprises: (i) contacting
the nucleic acid sample of step (a), and the isolated nucleic acid with restriction enzymes
under conditions permitting the digestion of the nucleic acid sample, and the isolated
nucleic acid into distinct, distinguishable pieces of nucleic acid; (ii) isolating the pieces
of nucleic acid; and (iii) comparing the pieces of nucleic acid derived from the nucleic
acid sample with the pieces of nucleic acid derived from the isolated nucleic acid so as
to thereby determine whether the nucleic acid sample is, or is derived from, a nucleic acid
which encodes mutant SH3D1A.
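
The restriction digest comparison of steps (i) to (iii) can be sketched in silico as
follows. This Python sketch cuts a linear sequence at each occurrence of a recognition
site and compares the resulting fragment lengths; it is a simplified model (one strand,
exact site matches, complete digestion), and the sequences shown are hypothetical and
are not SH3D1A sequences. The recognition site GAATTC, cut after the first G, is that of
EcoRI.

```python
def digest(sequence, site, cut_offset):
    """Fragment lengths after cutting a linear sequence at every `site`.

    `cut_offset` is the position of the cut within the recognition site,
    e.g. 1 for EcoRI (G^AATTC).
    """
    seq, site = sequence.upper(), site.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


def patterns_differ(sample, reference, site, cut_offset):
    """True if the two digests give different sets of fragment lengths."""
    return sorted(digest(sample, site, cut_offset)) != sorted(
        digest(reference, site, cut_offset)
    )


if __name__ == "__main__":
    # Hypothetical sequences; the sample has lost the second EcoRI site.
    reference = "AAAGAATTCCCCCGAATTCTT"
    sample = "AAAGAATTCCCCCGAGTTCTT"
    print(digest(reference, "GAATTC", 1))
    print(digest(sample, "GAATTC", 1))
    print(patterns_differ(sample, reference, "GAATTC", 1))
```

A difference in fragment pattern between the sample and the isolated nucleic acid
suggests, but does not by itself prove, a sequence difference at or near a recognition
site.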

The present invention further provides methods of preparing a polynucleotide
comprising polymerizing nucleotides to yield a sequence comprised of at least eight
consecutive nucleotides of the SH3D1A gene; and methods of preparing a polypeptide
comprising polymerizing amino acids to yield a sequence comprising at least five
amino acids encoded within the SH3D1A gene.

The present invention further provides methods of screening the SH3D1A gene to
identify mutations. Such methods may further comprise the step of amplifying a portion
of the SH3D1A gene, and may further include a step of providing a set of
polynucleotides which are primers for amplification of said portion of the SH3D1A
gene. The method is useful for identifying mutations for use in diagnosis of the
predisposition to, and diagnosis and treatment of, megakaryocyte abnormality,
hematopoietic disorders, myeloproliferative disorder, platelet disorder, leukemia, neural
abnormality or other disorder; and prenatal diagnosis and treatment of tumors. Useful
diagnostic techniques include, but are not limited to, fluorescent in situ hybridization
(FISH), direct DNA sequencing, PFGE analysis, Southern blot analysis, single
stranded conformation analysis (SSCA), RNase protection assay, allele-specific
oligonucleotide (ASO), dot blot analysis and PCR-SSCP, as discussed in detail further
below.

There are several methods that can be used to detect DNA sequence variation.
Direct DNA sequencing, either manual sequencing or automated fluorescent
sequencing, can detect sequence variation. For a gene as large as SH3D1A, manual
sequencing is very labor-intensive, but under optimal conditions, mutations in the
coding sequence of a gene are rarely missed. Another approach is the single-stranded
conformation polymorphism assay (SSCA) (Orita et al., 1989). This method does not
detect all sequence changes, especially if the DNA fragment size is greater than 200 bp,
but can be optimized to detect most DNA sequence variation. The reduced detection
sensitivity is a disadvantage, but the increased throughput possible with SSCA
makes it an attractive, viable alternative to direct sequencing for mutation detection on
a research basis. The fragments which have shifted mobility on SSCA gels are then
sequenced to determine the exact nature of the DNA sequence variation. Other
approaches based on the detection of mismatches between the two complementary DNA
strands include clamped denaturing gel electrophoresis (CDGE) (Sheffield et al.,
1991), heteroduplex analysis (HA) (White et al., 1992) and chemical mismatch
cleavage (CMC) (Grompe et al., 1989). None of the methods described above will
detect large deletions, duplications or insertions, nor will they detect a regulatory
mutation which affects transcription or translation of the protein. Other methods which
might detect these classes of mutations, such as a protein truncation assay or the
asymmetric assay, detect only specific types of mutations and would not detect
missense mutations. A review of currently available methods of detecting DNA
sequence variation can be found in Grompe (1993). Once a mutation
is known, an allele specific detection approach such as allele specific
oligonucleotide (ASO) hybridization can be utilized to rapidly screen large numbers of
other samples for that same mutation.

A rapid preliminary analysis to detect polymorphisms in DNA sequences can be
performed by looking at a series of Southern blots of DNA cut with one or more
restriction enzymes, preferably with a large number of restriction enzymes. Each blot
contains a series of normal individuals and a series of tumors. Southern blots
displaying hybridizing fragments (differing in length from control DNA when probed
with sequences near or including the SH3D1A gene) indicate a possible mutation.
If restriction enzymes which produce very large restriction fragments are used, then
pulsed field gel electrophoresis (PFGE) is employed.

Detection of point mutations may be accomplished by molecular cloning of the
SH3D1A allele(s) and sequencing the allele(s) using techniques well known in the
art. Alternatively, the gene sequences can be amplified directly from a genomic DNA
preparation from the tumor tissue, using known techniques. The DNA sequence of the
amplified sequences can then be determined. There are six well known methods for
a more complete, yet still indirect, test for confirming the presence of a susceptibility
allele: 1) single stranded conformation analysis (SSCA) (Orita et al., 1989); 2)
denaturing gradient gel electrophoresis (DGGE) (Wartell et al., 1990; Sheffield et al.,
1989); 3) RNase protection assays (Finkelstein et al., 1990; Kinszler et al., 1991); 4)
allele-specific oligonucleotides (ASOs) (Conner et al., 1983); 5) the use of proteins
which recognize nucleotide mismatches, such as the E. coli mutS protein (Modrich,
1991); and 6) allele-specific PCR (Rano & Kidd, 1989). For allele-specific PCR,
primers are used which hybridize at their 3' ends to a particular SH3D1A mutation.
If the particular SH3D1A mutation is not present, an amplification product is not
observed. Amplification Refractory Mutation System (ARMS) can also be used, as
disclosed in European Patent Application Publication No. 0332435 and in Newton et al.,
1989. Insertions and deletions of genes can also be detected by cloning, sequencing and
amplification. In addition, restriction fragment length polymorphism (RFLP) probes for
the gene or surrounding marker genes can be used to score alteration of an allele or
an insertion in a polymorphic fragment. Such a method is particularly useful for
screening relatives of an affected individual for the presence of the SH3D1A mutation
found in that individual. Other techniques for detecting insertions and deletions as
known in the art can be used.
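
The principle of allele-specific PCR described above, in which extension depends on the
3' terminal base of the primer matching the mutation, can be illustrated with a short
Python sketch. This is a deliberately simplified model: the primer is compared with the
same-sense strand rather than annealed to its complement, a fixed number of internal
mismatches is tolerated, and the sequences and function name are hypothetical and are
not SH3D1A sequences.

```python
def primer_extends(primer, template, max_internal_mismatches=1):
    """Model whether an allele-specific primer would yield a product.

    Extension is allowed only where the 3' terminal base of the primer
    matches the template exactly and the remaining bases have at most
    `max_internal_mismatches` mismatches.
    """
    primer, template = primer.upper(), template.upper()
    n = len(primer)
    for i in range(len(template) - n + 1):
        window = template[i:i + n]
        if window[-1] != primer[-1]:
            continue  # a 3' terminal mismatch blocks extension
        mismatches = sum(a != b for a, b in zip(primer[:-1], window[:-1]))
        if mismatches <= max_internal_mismatches:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical alleles differing by a single base (C to T).
    wild_type = "ATGGCCTTAGCTAGGCTA"
    mutant = "ATGGCCTTAGTTAGGCTA"
    # Primer whose 3' terminal base sits on the mutant base.
    mutant_primer = "GCCTTAGT"
    print(primer_extends(mutant_primer, mutant))
    print(primer_extends(mutant_primer, wild_type))
```

In this model the mutant-specific primer gives a product only on the mutant template,
which mirrors the statement that no amplification product is observed when the
particular mutation is absent.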

In similar fashion, DNA probes can be used to detect mismatches, through enzymatic
or chemical cleavage. See, e.g., Cotton et al., 1988; Shenk et al., 1975; Novack et al.,
1986. Alternatively, mismatches can be detected by shifts in the electrophoretic mobility
of mismatched duplexes relative to matched duplexes. See, e.g., Cariello, 1988. With
either riboprobes or DNA probes, the cellular mRNA or DNA which might contain a
mutation can be amplified using PCR (see below) before hybridization. Changes in DNA
of the SH3D1A gene can also be detected using Southern hybridization, especially if the
changes are gross rearrangements, such as deletions and insertions.

DNA sequences of the SH3D1A gene which have been amplified by use of PCR may
also be screened using allele-specific probes. These probes are nucleic acid oligomers,
each of which contains a region of the SH3D1A gene sequence harboring a known
mutation. For example, one oligomer may be about 30 nucleotides in length,
corresponding to a portion of the SH3D1A gene sequence. By use of a battery of
such allele-specific probes, PCR amplification products can be screened to identify
the presence of a previously identified mutation in the SH3D1A gene. Hybridization
of allele-specific probes with amplified SH3D1A sequences can be performed, for
example, on a nylon filter. Hybridization to a particular probe under stringent
hybridization conditions indicates the presence of the same mutation in the tumor
tissue as in the allele-specific probe.
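
Screening an amplification product against a battery of allele-specific probes can be
sketched as follows. In this Python sketch, hybridization under stringent conditions is
modeled as an exact match of the probe, or of its reverse complement, within the
amplified sequence. The probe names and sequences are hypothetical, are much shorter
than the approximately 30 nucleotide oligomers mentioned above, and are not SH3D1A
sequences.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def screen_amplicon(amplicon, probes):
    """Return the names of probes that match the amplicon exactly.

    `probes` maps a probe name to its sequence. Either strand of the
    amplicon may carry the match, so the reverse complement is also tested.
    """
    amplicon = amplicon.upper()
    hits = []
    for name, probe in probes.items():
        probe = probe.upper()
        if probe in amplicon or reverse_complement(probe) in amplicon:
            hits.append(name)
    return hits


if __name__ == "__main__":
    # Hypothetical battery of probes, each carrying one known mutation.
    probes = {"mutA": "CTTAGTTAGG", "mutB": "GGCCATAAGC"}
    print(screen_amplicon("ATGGCCTTAGTTAGGCTA", probes))  # carries mutA
    print(screen_amplicon("ATGGCCTTAGCTAGGCTA", probes))  # wild type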

Alteration of SH3D1A mRNA expression can be detected by any techniques known
in the art. These include Northern blot analysis, PCR amplification and RNase
protection. Diminished mRNA expression indicates an alteration of the wild-type
SH3D1A gene. Alteration of wild-type SH3D1A genes can also be detected by
screening for alteration of wild-type SH3D1A protein. For example, monoclonal
antibodies immunoreactive with SH3D1A can be used to screen a tissue. Lack of
cognate antigen would indicate an SH3D1A mutation. Antibodies specific for products
of mutant alleles could also be used to detect mutant SH3D1A gene product. Such
immunological assays can be done in any convenient format known in the art. These
include Western blots, immunohistochemical assays and ELISA assays. Any means
for detecting an altered SH3D1A protein can be used to detect alteration of wild-type
SH3D1A genes. Functional assays, such as protein binding determinations, can be
used. In addition, assays can be used which detect SH3D1A biochemical function.
Finding a mutant SH3D1A gene product indicates alteration of a wild-type SH3D1A
gene. Mutant SH3D1A genes or gene products can also be detected in other human
body samples, such as serum, stool, urine and sputum.

The present invention also provides for fusion polypeptides, comprising SH3D1A
polypeptides and fragments. Homologous polypeptides may be fusions between two or
more SH3D1A polypeptide sequences or between the sequences of SH3D1A and a
related protein. Likewise, heterologous fusions may be constructed which would
exhibit a combination of properties or activities of the derivative proteins. For example,
ligand-binding or other domains may be "swapped" between different new fusion
polypeptides or fragments. Such homologous or heterologous fusion polypeptides
may display, for example, altered strength or specificity of binding. Fusion partners
include immunoglobulins, bacterial beta-galactosidase, trpE, protein A,
beta-lactamase, alpha amylase, alcohol dehydrogenase and yeast alpha mating factor. See,
e.g., Godowski et al., 1988. Fusion proteins will typically be made by
recombinant nucleic acid methods, as described below, or may be chemically
synthesized. Techniques for the synthesis of polypeptides are described, for
example, in Merrifield, 1963.

This invention provides a method for determining whether a subject has a
megakaryocyte abnormality, myeloproliferative disorder, platelet disorder, or leukemia
which comprises: (a) obtaining an appropriate sample from the subject; and (b)
contacting the sample with the antibody so as to thereby determine whether a subject has
the megakaryocyte abnormality, myeloproliferative disorder, platelet disorder, or
leukemia.

This invention provides a method for determining whether a subject has a predisposition
for a megakaryocyte abnormality, myeloproliferative disorder, platelet disorder,
leukemia or a neural abnormality or other disorder, which comprises: (a) obtaining an
appropriate nucleic acid sample from the subject; and (b) determining whether the nucleic
acid sample from step (a) is, or is derived from, a nucleic acid which encodes SH3D1A
so as to thereby determine whether a subject has a predisposition for a megakaryocyte
abnormality, myeloproliferative disorder, platelet disorder, or leukemia.

This invention provides a method for determining whether a subject has a
megakaryocyte abnormality, myeloproliferative disorder, platelet disorder, leukemia or
a neural abnormality or other disorder, which comprises: (a) obtaining an appropriate
nucleic acid sample from the subject; and (b) determining whether the nucleic acid
sample from step (a) is, or is derived from, a nucleic acid which encodes the human
SH3D1A so as to thereby determine whether a subject has megakaryocyte abnormality,
myeloproliferative disorder, platelet disorder, leukemia or a neural abnormality or other
disorder. In one embodiment the nucleic acid sample in step (a) comprises mRNA
corresponding to the transcript of DNA encoding a human SH3D1A, and wherein the
determining of step (b) comprises: (i) contacting the mRNA with the oligonucleotide
under conditions permitting binding of the mRNA to the oligonucleotide so as to form
a complex; (ii) isolating the complex so formed; and (iii) identifying the mRNA in the
isolated complex so as to thereby determine whether the mRNA is, or is derived from,
a nucleic acid which encodes a human SH3D1A. A particular finding in accordance with
the invention is that such disorders as may occur in adult brain have been observed with
respect to the present invention, and accordingly adult patients may be diagnosed, and
if possible, treated by the application of the inventive subject matter hereof.

This invention provides a method of suppressing cells unable to regulate themselves
which comprises introducing a purified human SH3D1A into the cells in an amount
effective to suppress the cells.

This invention provides a method for identifying a chemical compound which is capable
of suppressing cells unable to regulate themselves in a subject which comprises: (a)
contacting the SH3D1A with a chemical compound under conditions permitting binding
between the SH3D1A and the chemical compound; (b) detecting specific binding of the
chemical compound to the SH3D1A; and (c) determining whether the chemical
compound inhibits the SH3D1A so as to identify a chemical compound which is capable
of suppressing cells unable to regulate themselves.

This invention provides a method for screening a tumor sample from a human subject for
a somatic alteration in a SH3D1A gene in said tumor which comprises comparing
a first sequence selected from the group consisting of a SH3D1A gene from said tumor
sample, SH3D1A RNA from said tumor sample and SH3D1A cDNA made from mRNA
from said tumor sample with a second sequence selected from the group consisting of
SH3D1A gene from a nontumor sample of said subject, SH3D1A RNA from said
nontumor sample and SH3D1A cDNA made from mRNA from said nontumor sample,
wherein a difference in the sequence of the SH3D1A gene, SH3D1A RNA or SH3D1A
cDNA from said tumor sample from the sequence of the SH3D1A gene, SH3D1A RNA
or SH3D1A cDNA from said nontumor sample indicates a somatic alteration in the
SH3D1A gene in said tumor sample.
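
The sequence comparison recited above can be illustrated with a minimal sketch. It assumes that the tumor and nontumor sequences have already been aligned to equal length; the function name find_differences and the example sequences are hypothetical and are not taken from this disclosure.

```python
def find_differences(tumor_seq, nontumor_seq):
    """Return (position, nontumor_base, tumor_base) for every mismatch.

    Positions are 1-based. Both sequences are assumed to be pre-aligned
    and of equal length (an assumption of this sketch).
    """
    if len(tumor_seq) != len(nontumor_seq):
        raise ValueError("sequences must be pre-aligned to equal length")
    differences = []
    pairs = zip(nontumor_seq.upper(), tumor_seq.upper())
    for index, (normal_base, tumor_base) in enumerate(pairs, start=1):
        if normal_base != tumor_base:
            differences.append((index, normal_base, tumor_base))
    return differences


def has_somatic_alteration(tumor_seq, nontumor_seq):
    """A difference between the two samples is taken to indicate an alteration."""
    return len(find_differences(tumor_seq, nontumor_seq)) > 0
```

Under these assumptions, a non-empty list corresponds to a difference between the first and second sequences, which the method treats as indicating a somatic alteration.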

This invention provides a method for screening a tumor sample from a human subject for
the presence of a somatic alteration in a SH3D1A gene in said tumor which comprises
comparing SH3D1A polypeptide from said tumor sample from said subject to SH3D1A
polypeptide from a nontumor sample from said subject to analyze for a difference
between the polypeptides, wherein said comparing is performed by (i) detecting either
a full length polypeptide or a truncated polypeptide in each sample or (ii) contacting an
antibody which specifically binds to either an epitope of an altered SH3D1A polypeptide
or an epitope of a wild-type SH3D1A polypeptide to the SH3D1A polypeptide from each
sample and detecting antibody binding, wherein a difference between the SH3D1A
polypeptide from said tumor sample from the SH3D1A polypeptide from said nontumor
sample indicates the presence of a somatic alteration in the SH3D1A gene in said tumor
sample.

This invention provides a method for monitoring the progress and adequacy of treatment
in a subject who has received treatment for a megakaryocytic abnormality,
myeloproliferative disorder, platelet disorder, leukemia or a condition involving a neural
abnormality or dysfunction, which comprises monitoring the level of nucleic acid
encoding the human SH3D1A at various stages of treatment.

This invention provides a pharmaceutical composition comprising an amount of a
polypeptide of the present invention, and a pharmaceutically effective carrier or diluent.

This invention provides a method of treating a subject having megakaryocytic
abnormality, myeloproliferative disorder, platelet disorder, or leukemia which comprises
introducing the isolated nucleic acid into the subject under conditions such that the
nucleic acid expresses SH3D1A, so as to thereby treat the subject.

This invention provides a method of treating a subject having megakaryocytic
abnormality, myeloproliferative disorder, platelet disorder, or leukemia which comprises
administering to the subject a therapeutically effective amount of the pharmaceutical
composition.

This invention is directed to diagnostic methods and therapeutic treatments relating to
the following: Wilms tumor, Li-Fraumeni syndrome, retinoblastoma, familial colon
cancer, acute myelogenous leukemia (AML), and myelodysplastic syndromes
(MDSs).

Further, it is contemplated that the disclosed invention is directed to
diverse hereditary disorders of platelet production. Clinical problems in these
disorders range from mild cutaneous petechiae or occasional epistaxes to severe
hemorrhage requiring red cell and platelet transfusions, and abnormalities of
thrombocyte structure, function, and number have been found by laboratory
evaluation of some of these patients.
Deviations from normality in various components of the platelet response during
hemostasis have been well characterized in a number of families and are known to
those skilled in the art. These include defects of platelet adhesion, secretion from
storage granules, and subsequent aggregation.

This invention provides a method of diagnosing megakaryocyte abnormality,
myeloproliferative disorder, platelet disorder, or leukemia in a subject which comprises:
(a) obtaining a nucleic acid molecule from a tumor lesion of the subject; (b) contacting
the nucleic acid molecule with a labelled nucleic acid molecule of at least 15 nucleotides
capable of specifically hybridizing with the isolated DNA, under hybridizing conditions;
and (c) determining the presence of the nucleic acid molecule hybridized, the presence
of which is indicative of megakaryocyte abnormality, myeloproliferative disorder,
platelet disorder, or leukemia in the subject, thereby diagnosing megakaryocyte
abnormality, myeloproliferative disorder, platelet disorder, or leukemia in the subject.

In one embodiment the DNA molecule from the tumor lesion is amplified before step (b).
In another embodiment PCR is employed to amplify the nucleic acid molecule. Methods
of amplifying nucleic acid molecules are known to those skilled in the art.

In the above described methods, a size fractionation may be employed which is effected
by a polyacrylamide gel. In one embodiment, the size fractionation is effected by an
agarose gel. Further, transferring the DNA fragments into a solid matrix may be
employed before a hybridization step. One example of such solid matrix is nitrocellulose
paper.

This invention provides a method of diagnosing megakaryocyte abnormality,
myeloproliferative disorder, platelet disorder, leukemia or a neural abnormality or
dysfunction, in a subject which comprises: (a) obtaining a nucleic acid molecule from a
suitable bodily fluid of the subject; (b) contacting the nucleic acid molecule with a
labelled nucleic acid molecule of at least 15 nucleotides capable of specifically
hybridizing with the isolated DNA, under hybridizing conditions; and (c) determining the
presence of the nucleic acid molecule hybridized, the presence of which is indicative of
megakaryocyte abnormality, myeloproliferative disorder, platelet disorder, leukemia or
neural abnormality or dysfunction, in the subject, thereby diagnosing megakaryocyte
abnormality, myeloproliferative disorder, platelet disorder, or leukemia in the subject.

This invention provides a method of diagnosing a DNA virus in a subject, which
comprises (a) obtaining a suitable bodily fluid sample from the subject, (b) contacting the
suitable bodily fluid of the subject to a support having already bound thereto an antibody,
so as to bind the antibody to a specific antigen, (c) removing unbound bodily fluid from
the support, and (d) determining the level of antibody bound by the antigen, thereby
diagnosing the subject for megakaryocyte abnormality, myeloproliferative disorder,
platelet disorder, leukemia or neural disorder.

This invention provides a method of diagnosing megakaryocyte abnormality,
myeloproliferative disorder, platelet disorder, or leukemia in a subject, which comprises
(a) obtaining a suitable bodily fluid sample from the subject, (b) contacting the suitable
bodily fluid of the subject to a support having already bound thereto an antigen, so as to
bind antigen to a specific antibody, (c) removing unbound bodily fluid from the support,
and (d) determining the level of the antigen bound by the antibody, thereby diagnosing
megakaryocytic abnormality, myeloproliferative disorder, platelet disorder, leukemia or
neural disorder.

A suitable bodily fluid includes, but is not limited to: serum, plasma, cerebrospinal fluid,
lymphocytes, urine, transudates, or exudates. In the preferred embodiment, the suitable
bodily fluid sample is serum or plasma. In addition, the bodily fluid sample may be cells
from bone marrow, or a supernatant from a cell culture. Methods of obtaining a suitable
bodily fluid sample from a subject are known to those skilled in the art. Methods of
determining the level of antibody or antigen include, but are not limited to: ELISA, IFA,
and Western blotting.

The diagnostic assays of the invention can be nucleic acid assays such as nucleic acid
hybridization assays and assays which detect amplification of specific nucleic acid to
detect a nucleic acid sequence of the human SH3D1A described herein.

Accepted means for conducting hybridization assays are known and general overviews
of the technology can be had from a review of: Nucleic Acid Hybridization: A Practical
Approach [72]; Hybridization of Nucleic Acids Immobilized on Solid Supports [41];
Analytical Biochemistry [4] and Innis et al., PCR Protocols [74], supra, all of which are
incorporated by reference herein.

Target specific probes may be used in the nucleic acid hybridization diagnostic. The
probes are specific for or complementary to the target of interest. For precise allelic
differentiations, the probes should be about 14 nucleotides long and preferably about
20-30 nucleotides. For more general detection of the human SH3D1A of the invention,
nucleic acid probes are about 50 to about 1000 nucleotides, most preferably about 200
to about 400 nucleotides.
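
The probe-length ranges stated above can be summarized in a short sketch. It treats the "about" values as hard cutoffs and reads "about 14 nucleotides" as a lower bound for allelic differentiation; both readings are assumptions made only for illustration, and the function name classify_probe_length is hypothetical.

```python
def classify_probe_length(length_nt):
    """Map a probe length (in nucleotides) to the use described in the text.

    The cutoffs below take the "about" figures literally, which is an
    assumption of this sketch rather than a statement of the disclosure.
    """
    if length_nt <= 0:
        raise ValueError("probe length must be positive")
    if 200 <= length_nt <= 400:
        return "general detection (most preferred)"
    if 50 <= length_nt <= 1000:
        return "general detection"
    if 20 <= length_nt <= 30:
        return "allelic differentiation (preferred)"
    if 14 <= length_nt < 20:
        return "allelic differentiation"
    return "outside stated ranges"
```

For instance, under these assumptions a 25-nucleotide probe falls in the preferred range for allelic differentiation, while a 300-nucleotide probe falls in the most preferred range for general detection.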

The specific nucleic acid probe can be RNA or DNA polynucleotide or oligonucleotide,
or their analogs. The probes may be single or double stranded nucleotides. The probes
of the invention may be synthesized enzymatically, using methods well known in the art
(e.g., nick translation, primer extension, reverse transcription, the polymerase chain
reaction, and others) or chemically (e.g., by methods such as the phosphoramidite method
described by Beaucage and Carruthers [19], or by the triester method according to
Matteucci, et al. [62], both incorporated herein by reference).

An alternative means for determining the presence of the human SH3D1A is in situ
hybridization, or more recently, in situ polymerase chain reaction. In situ PCR is
described in Nuovo et al. [71], Intracellular localization of polymerase chain reaction
(PCR)-amplified Hepatitis C cDNA; Bagasra et al. [10], Detection of Human
Immunodeficiency virus type 1 provirus in mononuclear cells by in situ polymerase chain
reaction; and Heniford et al. [35], Variation in cellular EGF receptor mRNA expression
demonstrated by in situ reverse transcriptase polymerase chain reaction. In situ
hybridization assays are well known and are generally described in Methods Enzymol
[67] incorporated by reference herein. In an in situ hybridization, cells are fixed to a
solid support, typically a glass slide. The cells are then contacted with a hybridization
solution at a moderate temperature to permit annealing of target-specific probes that are
labeled. The probes are preferably labelled with radioisotopes or fluorescent reporters.

The above described probes are also useful for in situ hybridization in order to locate
tissues which express this gene, or for other hybridization assays for the presence of this
gene or its mRNA in various biological tissues. In situ hybridization is a sensitive
localization method which is not dependent on expression of antigens or native vs.
denatured conditions.

In brief, inhibitory nucleic acid therapy approaches can be classified into those that target
DNA sequences, those that target RNA sequences (including pre-mRNA and mRNA),
those that target proteins (sense strand approaches), and those that cause cleavage or
chemical modification of the target nucleic acids.

Approaches targeting DNA fall into several categories. Nucleic acids can be designed
to bind to the major groove of the duplex DNA to form a triple helical or "triplex"
structure. Alternatively, inhibitory nucleic acids are designed to bind to regions of single
stranded DNA resulting from the opening of the duplex DNA during replication or
transcription.

More commonly, inhibitory nucleic acids are designed to bind to mRNA or mRNA
precursors. Inhibitory nucleic acids are used to prevent maturation of pre-mRNA.
Inhibitory nucleic acids may be designed to interfere with RNA processing, splicing or
translation.

The inhibitory nucleic acids can be targeted to mRNA. In this approach, the inhibitory
nucleic acids are designed to specifically block translation of the encoded protein. Using
this approach, the inhibitory nucleic acid can be used to selectively suppress certain
cellular functions by inhibition of translation of mRNA encoding critical proteins. For
example, an inhibitory nucleic acid complementary to regions of c-myc mRNA inhibits
c-myc protein expression in a human promyelocyte leukemia cell line, HL60, which
overexpresses the c-myc proto-oncogene. See Wickstrom E.L., et al. [93] and
Harel-Bellan, A., et al. [31A]. As described in Helene and Toulme, inhibitory nucleic
acids targeting mRNA have been shown to work by several different mechanisms to
inhibit translation of the encoded protein(s).

Lastly, the inhibitory nucleic acids can be used to induce chemical inactivation or
cleavage of the target genes or mRNA. Chemical inactivation can occur by the induction
of crosslinks between the inhibitory nucleic acid and the target nucleic acid within the
cell. Other chemical modifications of the target nucleic acids induced by appropriately
derivatized inhibitory nucleic acids may also be used.

Cleavage, and therefore inactivation, of the target nucleic acids may be effected by
attaching a substituent to the inhibitory nucleic acid which can be activated to induce
cleavage reactions. The substituent can be one that effects either chemical or enzymatic
cleavage. Alternatively, cleavage can be induced by the use of ribozymes or catalytic
RNA. In this approach, the inhibitory nucleic acids would comprise either naturally
occurring RNA (ribozymes) or synthetic nucleic acids with catalytic activity.

As used herein, "pharmaceutical composition" could mean therapeutically effective amounts
of polypeptide products of the invention together with suitable diluents, preservatives,
solubilizers, emulsifiers, adjuvants and/or carriers useful in SCF (stem cell factor) therapy.
A "therapeutically effective amount" as used herein refers to that amount which provides
a therapeutic effect for a given condition and administration regimen. Such compositions
are liquids or lyophilized or otherwise dried formulations and include diluents of various
buffer content (e.g., Tris-HCl, acetate, phosphate), pH and ionic strength, additives such
as albumin or gelatin to prevent absorption to surfaces, detergents (e.g., Tween 20,
Tween 80, Pluronic F68, bile acid salts), solubilizing agents (e.g., glycerol, polyethylene
glycerol), antioxidants (e.g., ascorbic acid, sodium metabisulfite), preservatives (e.g.,
Thimerosal, benzyl alcohol, parabens), bulking substances or tonicity modifiers (e.g.,
lactose, mannitol), covalent attachment of polymers such as polyethylene glycol to the
protein, complexation with metal ions, or incorporation of the material into or onto
particulate preparations of polymeric compounds such as polylactic acid, polyglycolic
acid, hydrogels, etc., or onto liposomes, microemulsions, micelles, unilamellar or
multilamellar vesicles, erythrocyte ghosts, or spheroplasts. Such compositions will
influence the physical state, solubility, stability, rate of in vivo release, and rate of in vivo
clearance of SCF. The choice of compositions will depend on the physical and chemical
properties of the protein having SCF activity. For example, a product derived from a
membrane-bound form of SCF may require a formulation containing detergent.
Controlled or sustained release compositions include formulation in lipophilic depots
(e.g., fatty acids, waxes, oils). Also comprehended by the invention are particulate
compositions coated with polymers (e.g., poloxamers or poloxamines) and SCF coupled
to antibodies directed against tissue-specific receptors, ligands or antigens or coupled to
ligands of tissue-specific receptors. Other embodiments of the compositions of the
invention incorporate particulate forms, protective coatings, protease inhibitors or
permeation enhancers for various routes of administration, including parenteral,
pulmonary, nasal and oral.

Further, as used herein, "pharmaceutically acceptable carriers" are well known to those
skilled in the art and include, but are not limited to, 0.01-0.1M and preferably 0.05M
phosphate buffer or 0.8% saline. Additionally, such pharmaceutically acceptable carriers
may be aqueous or non-aqueous solutions, suspensions, and emulsions. Examples of
non-aqueous solvents are propylene glycol, polyethylene glycol, vegetable oils such as
olive oil, and injectable organic esters such as ethyl oleate. Aqueous carriers include
water, alcoholic/aqueous solutions, emulsions or suspensions, including saline and
buffered media. Parenteral vehicles include sodium chloride solution, Ringer's dextrose,
dextrose and sodium chloride, lactated Ringer's or fixed oils. Intravenous vehicles
include fluid and nutrient replenishers, electrolyte replenishers such as those based on
Ringer's dextrose, and the like. Preservatives and other additives may also be present,
such as, for example, antimicrobials, antioxidants, chelating agents, inert gases and the
like.



The term "adjuvant" refers to a compound or mixture that enhances the immune response
to an antigen. An adjuvant can serve as a tissue depot that slowly releases the antigen
and also as a lymphoid system activator that non-specifically enhances the immune
response (Hood et al., Immunology, Second Ed., 1984, Benjamin/Cummings: Menlo
Park, California, p. 384). Often, a primary challenge with an antigen alone, in the
absence of an adjuvant, will fail to elicit a humoral or cellular immune response.
Adjuvants include, but are not limited to, complete Freund's adjuvant, incomplete Freund's
adjuvant, saponin, mineral gels such as aluminum hydroxide, surface active substances
such as lysolecithin, pluronic polyols, polyanions, peptides, oil or hydrocarbon
emulsions, keyhole limpet hemocyanins, dinitrophenol, and potentially useful human
adjuvants such as BCG (bacille Calmette-Guérin) and Corynebacterium parvum.
Preferably, the adjuvant is pharmaceutically acceptable.



Controlled or sustained release compositions include formulation in lipophilic depots
(e.g., fatty acids, waxes, oils). Also comprehended by the invention are particulate
compositions coated with polymers (e.g., poloxamers or poloxamines) and the compound
coupled to antibodies directed against tissue-specific receptors, ligands or antigens or



WO 99/53062 



PCT/US99/08371 



49 

coupled to ligands of tissue-specific receptors. Other embodiments of the compositions
of the invention incorporate particulate forms, protective coatings, protease inhibitors or
permeation enhancers for various routes of administration, including parenteral,
pulmonary, nasal and oral.

When administered, compounds are often cleared rapidly from mucosal surfaces or the
circulation and may therefore elicit relatively short-lived pharmacological activity.
Consequently, frequent administrations of relatively large doses of bioactive compounds
may be required to sustain therapeutic efficacy. Compounds modified by the covalent
attachment of water-soluble polymers such as polyethylene glycol, copolymers of
polyethylene glycol and polypropylene glycol, carboxymethyl cellulose, dextran,
polyvinyl alcohol, polyvinylpyrrolidone or polyproline are known to exhibit
substantially longer half-lives in blood following intravenous injection than do the
corresponding unmodified compounds (Abuchowski et al., 1981; Newmark et al., 1982;
and Katre et al., 1987). Such modifications may also increase the compound's solubility
in aqueous solution, eliminate aggregation, enhance the physical and chemical stability
of the compound, and greatly reduce the immunogenicity and reactivity of the compound.
As a result, the desired in vivo biological activity may be achieved by the administration
of such polymer-compound adducts less frequently or in lower doses than with the
unmodified compound.

Dosages. The sufficient amount may include but is not limited to from about 1 µg/kg to
about 1000 mg/kg. The amount may be 10 mg/kg. The pharmaceutically acceptable
form of the composition includes a pharmaceutically acceptable carrier.

The preparation of therapeutic compositions which contain an active component is well
understood in the art. Typically, such compositions are prepared as an aerosol of the
polypeptide delivered to the nasopharynx or as injectables, either as liquid solutions or
suspensions; however, solid forms suitable for solution in, or suspension in, liquid prior
to injection can also be prepared. The preparation can also be emulsified. The active
therapeutic ingredient is often mixed with excipients which are pharmaceutically
acceptable and compatible with the active ingredient. Suitable excipients are, for
example, water, saline, dextrose, glycerol, ethanol, or the like and combinations thereof.
In addition, if desired, the composition can contain minor amounts of auxiliary
substances such as wetting or emulsifying agents, or pH buffering agents which enhance the
effectiveness of the active ingredient.

An active component can be formulated into the therapeutic composition as neutralized
pharmaceutically acceptable salt forms. Pharmaceutically acceptable salts include the
acid addition salts (formed with the free amino groups of the polypeptide or antibody
molecule) and which are formed with inorganic acids such as, for example, hydrochloric
or phosphoric acids, or such organic acids as acetic, oxalic, tartaric, mandelic, and the
like. Salts formed from the free carboxyl groups can also be derived from inorganic
bases such as, for example, sodium, potassium, ammonium, calcium, or ferric
hydroxides, and such organic bases as isopropylamine, trimethylamine, 2-ethylamino
ethanol, histidine, procaine, and the like.

A composition comprising "A" (where "A" is a single protein, DNA molecule, vector,
etc.) is substantially free of "B" (where "B" comprises one or more contaminating
proteins, DNA molecules, vectors, etc.) when at least about 75% by weight of the
proteins, DNA, vectors (depending on the category of species to which A and B belong)
in the composition is "A". Preferably, "A" comprises at least about 90% by weight of the
A+B species in the composition, most preferably at least about 99% by weight.
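
The weight-fraction definition above can be expressed as a small worked calculation. The sketch below assumes that the masses of "A" and "B" are known in the same units and takes the "about" percentages as exact cutoffs; the function names are hypothetical.

```python
def weight_fraction_of_a(mass_a, mass_b):
    """Fraction by weight of "A" among the A+B species."""
    total = mass_a + mass_b
    if mass_a < 0 or mass_b < 0 or total <= 0:
        raise ValueError("masses must be non-negative with a positive total")
    return mass_a / total


def purity_tier(mass_a, mass_b):
    """Classify a composition using the 75%, 90% and 99% thresholds."""
    fraction = weight_fraction_of_a(mass_a, mass_b)
    if fraction >= 0.99:
        return "substantially free of B (most preferred)"
    if fraction >= 0.90:
        return "substantially free of B (preferred)"
    if fraction >= 0.75:
        return "substantially free of B"
    return "not substantially free of B"
```

For example, under these assumptions a composition containing 95 mg of "A" and 5 mg of "B" has a weight fraction of 0.95 and would fall in the preferred tier.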

The phrase "therapeutically effective amount" is used herein to mean an amount
sufficient to reduce by at least about 15 percent, preferably by at least 50 percent, more
preferably by at least 90 percent, and most preferably prevent, a clinically significant
deficit in the activity, function and response of the host.

According to the invention, the component or components of a therapeutic composition
of the invention may be introduced parenterally, transmucosally, e.g., orally, nasally,
pulmonarily, or rectally, or transdermally. Preferably, administration is parenteral, e.g.,
via intravenous injection, and also includes, but is not limited to, intra-arteriole,
intramuscular, intradermal, subcutaneous, intraperitoneal, intraventricular, and
intracranial administration. Oral or pulmonary delivery may be preferred to activate
mucosal immunity; since pneumococci generally colonize the nasopharyngeal and
pulmonary mucosa, mucosal immunity may be a particularly effective preventive
treatment. The term "unit dose" when used in reference to a therapeutic composition of
the present invention refers to physically discrete units suitable as unitary dosage for
humans, each unit containing a predetermined quantity of active material calculated to
produce the desired therapeutic effect in association with the required diluent, i.e., carrier
or vehicle.

In another embodiment, the active compound can be delivered in a vesicle, in particular
a liposome (see Langer, Science 249:1527-1533 (1990); Treat et al., in Liposomes in the
Therapy of Infectious Disease and Cancer, Lopez-Berestein and Fidler (eds.), Liss, New
York, pp. 353-365 (1989); Lopez-Berestein, ibid., pp. 317-327; see generally ibid.).

In yet another embodiment, the therapeutic compound can be delivered in a controlled
release system. For example, the polypeptide may be administered using intravenous
infusion, an implantable osmotic pump, a transdermal patch, liposomes, or other modes
of administration. In one embodiment, a pump may be used (see Langer, supra; Sefton,
CRC Crit. Ref. Biomed. Eng. 14:201 (1987); Buchwald et al., Surgery 88:507 (1980);
Saudek et al., N. Engl. J. Med. 321:574 (1989)). In another embodiment, polymeric
materials can be used (see Medical Applications of Controlled Release, Langer and Wise
(eds.), CRC Press, Boca Raton, Florida (1974); Controlled Drug Bioavailability, Drug
Product Design and Performance, Smolen and Ball (eds.), Wiley, New York (1984);
Ranger and Peppas, J. Macromol. Sci. Rev. Macromol. Chem. 23:61 (1983); see also
Levy et al., Science 228:190 (1985); During et al., Ann. Neurol. 25:351 (1989); Howard
et al., J. Neurosurg. 71:105 (1989)). In yet another embodiment, a controlled release
system can be placed in proximity of the therapeutic target, i.e., the brain, thus requiring
only a fraction of the systemic dose (see, e.g., Goodson, in Medical Applications of
Controlled Release, supra, vol. 2, pp. 115-138 (1984)). Preferably, a controlled release
device is introduced into a subject in proximity of the site of inappropriate immune
activation or a tumor. Other controlled release systems are discussed in the review by
Langer 1990, Science 249:1527-1533.

A subject in whom administration of an active component as set forth above is an
effective therapeutic regimen for a bacterial infection is preferably a human, but can be
any animal. Thus, as can be readily appreciated by one of ordinary skill in the art, the
methods and pharmaceutical compositions of the present invention are particularly suited
to administration to any animal, particularly a mammal, and including, but by no means
limited to, domestic animals, such as feline or canine subjects, farm animals, such as but
not limited to bovine, equine, caprine, ovine, and porcine subjects, wild animals (whether
in the wild or in a zoological garden), research animals, such as mice, rats, rabbits, goats,
sheep, pigs, dogs, cats, etc., i.e., for veterinary medical use.

In the therapeutic methods and compositions of the invention, a therapeutically effective
dosage of the active component is provided. A therapeutically effective dosage can be
determined by the ordinary skilled medical worker based on patient characteristics (age,
weight, sex, condition, complications, other diseases, etc.), as is well known in the art.
Furthermore, as further routine studies are conducted, more specific information will
emerge regarding appropriate dosage levels for treatment of various conditions in various
patients, and the ordinary skilled worker, considering the therapeutic context, age and
general health of the recipient, is able to ascertain proper dosing. Generally, for
intravenous injection or infusion, dosage may be lower than for intraperitoneal,
intramuscular, or other route of administration. The dosing schedule may vary,
depending on the circulation half-life, and the formulation used. The compositions are
administered in a manner compatible with the dosage formulation in the therapeutically
effective amount. Precise amounts of active ingredient required to be administered
depend on the judgment of the practitioner and are peculiar to each individual. However,
suitable dosages may range from about 0.1 to 20, preferably about 0.5 to about 10, and
more preferably one to several, milligrams of active ingredient per kilogram body weight
of individual per day and depend on the route of administration. Suitable regimes for
initial administration and booster shots are also variable, but are typified by an initial
administration followed by repeated doses at one or more hour intervals by a subsequent
injection or other administration. Alternatively, continuous intravenous infusion
sufficient to maintain concentrations of ten nanomolar to ten micromolar in the blood is
contemplated.
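
The per-kilogram ranges stated above reduce to simple arithmetic, sketched below for illustration. The sketch assumes the bounds 0.1 to 20 mg/kg/day (and the preferred 0.5 to 10 mg/kg/day) are taken literally; the function name and the 70 kg body weight in the usage note are hypothetical.

```python
def daily_dose_range_mg(body_weight_kg, low_mg_per_kg=0.1, high_mg_per_kg=20.0):
    """Return (low, high) daily amounts in milligrams for a given body weight.

    Default bounds follow the broad range stated in the text; the preferred
    range can be obtained by passing 0.5 and 10.0 instead.
    """
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if low_mg_per_kg > high_mg_per_kg:
        raise ValueError("lower bound must not exceed upper bound")
    return (low_mg_per_kg * body_weight_kg, high_mg_per_kg * body_weight_kg)
```

As a usage example, calling daily_dose_range_mg(70) gives roughly 7 mg to 1400 mg per day for the broad range, and daily_dose_range_mg(70, 0.5, 10.0) gives roughly 35 mg to 700 mg per day for the preferred range.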

This invention is illustrated in the Experimental Details section which follows. These 
sections are set forth to aid in an understanding of the invention but are not intended to, 
and should not be construed to, limit in any way the invention as set forth in the claims 
which follow thereafter. 

EXPERIMENTAL DETAILS SECTION 

The invention discloses a small candidate region of 50-200 kb for low platelets in 
deletion for chromosome 21. At present, the candidate region for the familial platelet 
disorder is greater than 3,000 kb, a region containing as many as 150 genes. The 
SH3D1A is mapped to the small candidate region for low platelets for chromosome 21.
Northern analysis using new sequence from SH3D1A reveals an abnormal band with 
significantly higher expression in RNA from lymphoblastoid cells derived from an 
affected individual vs. normal controls. DNA sequence analyses reveal homologies 
to domains that suggest involvement in developmental and/or cell regulatory 
phenomena such as lead to cancers when disturbed. These include the SH3 domains 
as well as EH domains, both associated with protein-protein interactions and the latter 
associated with maintenance of the cytoskeleton. Therefore, mutations, or increased 
or decreased expression are ultimately responsible for familial platelet disorder and 
possibly also for DS leukemias, subsets of non-DS leukemias and the processes that 
ultimately lead to abnormal platelets associated with deletion of chromosome 21. 

Materials and Methods 



Genomic clone obtained by screening the BAC library with EST: In order to study
the gene structure of SH3D1A, genomic clones were obtained by screening a human
BAC library B with a radio-labeled EST (cDNA) (dbEST#482496, Research Genetics,
AL) according to the procedure described by Hubert et al. (1997). Three positive clones
were observed.

Fluorescence in situ hybridization (FISH) to confirm the cytogenetic location of
BAC 119E16 on chromosome 21q22.11-12: BAC DNAs were prepared as described
previously (Hubert et al., 1997). The BAC DNAs were biotinylated and used as probes
for FISH on normal human chromosome preparations following the
procedure described by Korenberg and Chen (1995). BAC 119E16 was confirmed to map
to chromosome 21q22.11-12 by reviewing more than 50 cells. This was further
confirmed by PCR using custom-designed primers for SH3D1A based on
sequencing information.

Sequencing cDNA and part of the genomic DNA: The cDNA was sequenced using
RT-PCR products templated on total brain cDNA or directly on BAC 119E16 containing
the gene.

Reverse transcription - polymerase chain reaction (RT-PCR): SH3D1A cDNA was
amplified by RT-PCR using a standard method. Briefly, the control RNA was isolated
from a normal male cell line using the TRI reagent kit (Molecular Research Center, Inc.,
Cincinnati, OH). The first strand of cDNA was then produced using the Superscript Choice
System (Pharmacia LKB Biotechnology). The PCR reaction was performed using
custom-designed primers on a PTC-100 Programmable Thermal Controller following a
standard PCR procedure. The PCR products for sequencing were purified
with the Geneclean Kit (BIO 101, Inc., Vista, CA) prior to sequencing. To
produce clearer sequence, some PCR products were subcloned into the pCR-2.1 Vector
(CLONETECH Laboratory, Inc.) prior to sequencing.
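As an aside on custom primer design, a reverse primer is written as the reverse complement of the target strand, and a rough melting temperature for a short oligonucleotide is often estimated with the Wallace rule (2 °C per A/T, 4 °C per G/C). The sketch below is illustrative only: the primer sequence is hypothetical, the disclosure does not list primer sequences, and nothing here is asserted to be the procedure actually used.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def wallace_tm(primer):
    """Rough melting temperature in degrees C by the Wallace rule.

    Tm = 2 * (A + T) + 4 * (G + C); a rule of thumb for short oligonucleotides.
    """
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


# Hypothetical 20-mer, not taken from the disclosure.
primer = "ATGGCTCAGTTTCCAACACC"
print(reverse_complement(primer), wallace_tm(primer))
```

More accurate estimates use nearest-neighbor thermodynamics; the Wallace rule is shown only because it can be checked by hand.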

PCR of genomic DNA: Three genomic (exon) fragments were generated via PCR
using the BAC 119E16 DNA as template, and were purified and sequenced as described above
and below.

Sequencing SH3D1A:

The nucleotide sequences of both the coding and non-coding strands were determined in
their entirety by the dideoxy chain termination method using the ABI PRISM Sequences
DNA sequencing kit (PERKIN ELMER) with custom-made primers. The templates for
DNA sequencing were either PCR products or subclones as described above.

Sequencing the upstream region of SH3D1A:

In order to complete sequencing of the 5' end of SH3D1A and identify the site of
initiation of transcription, the following two methods were utilized:

1. 5' RACE:

5' RACE was performed using the 5' Marathon RACE kit (CLONETECH Laboratories,
Inc., CA). The reaction products were then electrophoresed on 1% SeaPlaque GTG
agarose (FMC BioProducts, Rockland, ME). The products with the longest sizes
(>2 kb) were then further confirmed by sequencing nested PCR fragments.

2. cDNA isolation from cDNA library:

The human cDNA clones were obtained from a cDNA library screening as described in
Yamakawa et al. (1995). The cDNAs were oligo(dT) primed and cloned unidirectionally
into the EcoRI and XhoI sites of the vector. The sizes of the clones were analyzed by
electrophoresis, and the clones were then used for sequencing.

Sequencing Analysis:

Data processing was performed using ABI Sequencing Analysis software, which assessed
trace quality and assembled sequence data (ABI Autoassemble program). The vector
clipping was performed manually. To ensure the accuracy of the sequence, all regions
of the finished sequence were covered by more than one subclone or PCR fragment,
usually 3-5X, and were always sequenced in opposite orientations. The sequence of the
human SH3D1A was screened against GenBank (BLASTN & BLASTX). It was also
compared with the previously published SH3P17 sequence (Hsu61166) by using the V-gcg
program. Significant differences between the previously published SH3P17 and this
newly sequenced SH3D1A were found. These equalled about 8% of the nucleotides.
The previous sequence totalled only 3,230 bp of the 3' end vs. the subject invention's
sequence of 5,200 bp. Comparison with the complete homologous sequence
gb#AF032118 in Xenopus laevis indicated the same protein start site and a similar but
not identical domain structure; see Figures 1 and 2.
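A difference figure such as "about 8% of the nucleotides" corresponds to roughly 92% identity over the aligned positions. The following minimal sketch shows one common way to compute percent identity from two already-aligned sequences. The gap-handling convention and the toy sequences are assumptions for illustration and are not taken from the V-gcg comparison described above.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns of a pairwise alignment.

    Columns where both sequences carry a gap are ignored; a gap opposite a
    base counts as a mismatch (one convention among several).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy alignment (hypothetical): 8 of 10 columns match.
identity = percent_identity("ATGGCTCAGT", "ATGGCACAG-")
print(identity, 100.0 - identity)
```

Reported identity values depend on how gaps and unaligned ends are counted, so figures from different programs are not always directly comparable.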

SH3D1A Gene Structure: 

Protein structure was based on cDNA sequence analysis. The four SH3 domains were
confirmed previously (Sparks et al., 1996). However, most significant was the definition
of additional domains, including EH domains (Eps Homolog domains) in the N terminal
end, that have been associated with protein interactions involved with cell cycle control
and morphogenesis. These suggested a possible role both in human embryogenesis and
in cancers, notably the leukemias associated with Down Syndrome (DS), the decreased
platelets associated with deletion of chromosome 21 reported by Fannin et al. (1995), and
the familial platelet disorder reported by Dowton et al. (1985) and Ho et al. (1996), all
of whose map positions include SH3P17.

Gene expression study by Northern Blotting: 

Northern blots made from multiple human tissues were used to perform this study
according to the manufacturer's instructions (CLONETECH Laboratory, Inc., CA).
Referring to Figure 6, the gene was found to be expressed in all adult human tissues
tested, which included heart, brain, placenta, lung, liver, muscle, kidney and pancreas.

Preparation of full length cDNA Clones corresponding to SH3D1A

A cDNA library based on fetal brain was screened in the same manner as described above
with respect to the isolation and sequencing of SH3D1A. Sequencing of
5 different sizes of the cDNA clones was conducted, and indicated that at least
three isoforms exist. Of the sequenced cDNA clones shown in Figure 8, #21
was a full-length cDNA that contains 5438 nucleotides and codes for 1221 amino acids;
#11 was a shorter full-length cDNA that contains 5179 nucleotides and codes for 1215
amino acids; clone #s 5 and #9 represent 2192bp, 3193bp and 3128bp length cDNA
respectively, while #5 was identical to #21 and #11 at the 5' UTR, containing only two
EH domains.
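The relationship between cDNA length and encoded protein length can be checked with a short open-reading-frame scan. The sketch below is a hedged illustration: the function names and toy sequence are hypothetical, only the three forward frames are scanned, and it is not the analysis software used in the study. Under the assumption that the stop codon is counted, a 1221 amino acid product implies a 3666 nucleotide reading frame, leaving 1772 of the 5438 nucleotides of clone #21 as untranslated sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(cdna):
    """Return (start, end) of the longest ATG-to-stop ORF in the forward frames.

    Coordinates are 0-based and half-open; the stop codon is included.
    Returns (0, 0) when no complete ORF is found.
    """
    seq = cdna.upper()
    best = (0, 0)
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) > (best[1] - best[0]):
                    best = (start, i + 3)
                start = None
    return best


def coding_length_nt(n_amino_acids):
    """Nucleotides needed to encode a protein, counting the stop codon."""
    return 3 * (n_amino_acids + 1)


# Toy cDNA (hypothetical): CC ATG AAA TTT GGG TAA CC
toy_orf = longest_orf("CCATGAAATTTGGGTAACC")
print(toy_orf, coding_length_nt(1221), 5438 - coding_length_nt(1221))
```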

In comparing the cDNAs generated in this study with previously published
homologs, and in comparing the cDNAs isolated in this study with each other, we found
significant differences, as shown in Figure 18. The differences between #21 vs ITSs, #21
vs #11 and #9 vs SH3P17 are listed here: #21 is 99.8% identical to ITSs (AF064243;
Guipponi et al., 1998) at the protein level, showing only 1 amino acid difference at position
114, while an extra 160bp at the 5' UTR and an XXbp difference at the 3' UTR of #21
give a 96.7% identity at the nucleotide level; #11 was missing 5 amino acids at
cDNA positions 2573-2586 within the SH3-A domain and missing 222 nucleotides within the
3' UTR region when compared to #21; #9 was 100% identical to SH3P17 (GenBank
Hsu61166, Sparks et al., 1996) in the coding region, but it shows 76.8% identity at the
nucleotide level; the major difference is at the 3' UTR, where a total of 222bp is
missing at positions 2189 (3963-1774) to 2411, the same position as
shown for #11 vs #21. #9 and SH3P17 showed only four SH3 domains, missing the SH3-C
domain (Guipponi et al., 1998) (Figure 3).

The homologies of ITSN to other proteins are also included in Figure 2 (Sparks et al.,
1996 and Guipponi et al., 1998), as discussed by Guipponi et al. (1998).

Genomic organization of the ITSN gene and comparison to SH3P17 and ITSs/ITSl: 

The comparison of the human SH3D1A to sequenced human genomic DNA (GenBank
No. AP000050, AP000049 and AP000048) in this region on chromosome 21 revealed that
this gene consists of 29 exons (see Figure 3 and Table 2 for exact exon-intron boundaries),
the sizes of which vary from 44 to 1516 bp. The sizes of the introns range from 355 bp
to 7.5 kb. All introns have splice donor and acceptor sites that conform to the general
GT-AG consensus motif. The putative SH3D1A translation initiation codon is located
on exon 2, while the stop codon is on exon 28.
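Conformance to the GT-AG rule can be checked mechanically once exon coordinates on the genomic sequence are known: each intron should begin with GT (donor) and end with AG (acceptor). The following is a minimal sketch under stated assumptions; the function names, the coordinate convention and the toy sequence are hypothetical and are not drawn from Table 2.

```python
def conforms_to_gt_ag(intron):
    """True if the intron starts with GT and ends with AG."""
    s = intron.upper()
    return len(s) >= 4 and s.startswith("GT") and s.endswith("AG")


def nonconforming_introns(genomic, exons):
    """Return (start, end) of introns that violate the GT-AG rule.

    `exons` is a list of (start, end) pairs, 0-based and half-open,
    sorted by position on the genomic sequence.
    """
    bad = []
    for (_, prev_end), (next_start, _) in zip(exons, exons[1:]):
        intron = genomic[prev_end:next_start]
        if not conforms_to_gt_ag(intron):
            bad.append((prev_end, next_start))
    return bad


# Toy locus (hypothetical): exon, GT...AG intron, exon, GC...AC intron, exon.
toy_genomic = "ATGGCC" + "GTAAGTTTTCAG" + "GGATTC" + "GCAAAAAAAC" + "TTTTAA"
toy_exons = [(0, 6), (18, 24), (34, 40)]
print(nonconforming_introns(toy_genomic, toy_exons))
```

An empty result for a full gene would be consistent with the statement that all introns conform to the consensus.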

Characterization of the 5' upstream sequence

To determine the 5' upstream sequence of the human SH3D1A gene, the sequence from
PAC T1276 was used to carry out the analysis to search for the promoter(s).

Complex mRNA expression on multiple adult and fetal tissues (See Figure 17:
Summary of studies on ITS)

As shown in the table and figure, Northern blot of SH3D1A on multiple adult and fetal
tissues revealed unexpectedly complicated results. A total of 14 probes were used for the
expression study (Part 1). There were 6 major mRNA transcripts detected, including a
5.4 kb mRNA fragment that was expressed ubiquitously in adult tissues (heart, brain,
placenta, lung, liver, muscle, kidney and pancreas) and fetal tissues (brain, lung, liver and
kidney) using any of the probes, as shown in the top portion of the Figure; a 2.5 kb
fragment expressed ubiquitously in adult tissues, but strongly in muscle, when using probe #1
(exon 1); a 2.0 kb fragment that was expressed ubiquitously in adult and fetal tissues when using
all of the probes except for probes #2, 3 and #12-13 (exon 2-7 and exon 28-29), with the
strongest expression shown in muscle in adult and in liver and brain in fetal tissues; a
4.5 kb fragment expressed ubiquitously, but more strongly in liver, seen only in fetal tissues when
using probes #4, 6, 9 and 12 (exon 7 to 17 and exon 23-25); finally, a fragment larger than
11 kb that was expressed specifically in brain using probes #2 and 3 (exons 2 to 7) in
adult and fetal tissue, and seen only in adult tissue using probe #9 (exon 22-28). Further,
there was a small 1.0 kb fragment also seen in fetal liver using probes #4 and
6 (exon 7 to 17).
RESULTS 

The data presented herein confirm the role of the genes of the invention in conditions
relating to leukemia as well as neural abnormalities and dysfunctions. As mentioned
earlier, the genes are associated with changes that occur in regions related to leukemia, and
with brain abnormalities observed in adult brain. The role of this family of
genes in the regulation of both neural and leukemic conditions supports a broad
modulatory influence on both development and homeostasis that commends their
application in the diagnostic and therapeutic modalities presented herein.

WO 99/53062 PCT/US99/08371

This invention may be embodied in other forms or carried out in other ways without
departing from the spirit or essential characteristics thereof. The present disclosure is
therefore to be considered as in all aspects illustrative and not restrictive, the scope of the
invention being indicated by the appended Claims, and all changes which come within
the meaning and range of equivalency are intended to be embraced therein.

Various references have been identified and referred to herein. The disclosures of such
references, as well as other publications, patent disclosures or documents recited
herein, are all incorporated herein by reference in their entireties.

WHAT IS CLAIMED IS:

1. An isolated nucleic acid which encodes a human SH3D1A, including analogs,
fragments, variants, and mutants thereof.

2. The isolated nucleic acid of claim 1, wherein the nucleic acid has a nucleotide
sequence having at least 85% similarity with the nucleic acid coding sequence of
SEQ ID NO: 1, or that of Figures 8, 10, 12 or 14.

3. The isolated nucleic acid of claim 1, wherein the nucleic acid is DNA or RNA.

4. The isolated nucleic acid of claim 2, wherein the nucleic acid is cDNA or 
genomic DNA. 

5. The isolated nucleic acid of claim 1, wherein the nucleic acid encodes an amino 
acid sequence which forms two EH domains and four SH3 domains. 

6. The isolated nucleic acid of claim 4, wherein the nucleic acid encodes an amino 
acid sequence which forms one or more myristoylation sites in the EH domains 
and SH3 domains. 

7. The isolated nucleic acid of claim 4, wherein the nucleic acid encodes an amino 
acid sequence of the EH1 domain which corresponds to the region from about 
amino acid sequence 15 to about sequence 102 of Figure 5. 

8. The isolated nucleic acid of claim 4, wherein the nucleic acid encodes an amino 
acid sequence of the EH2 domain which corresponds to the region from about 
215 to about sequence 310 of Figure 5. 

9. The isolated nucleic acid of claim 4, wherein the nucleic acid encodes an amino
acid sequence of the SH3-1 domain which corresponds to the region from about
sequence 740 to about sequence 800 of Figure 5.

10. The isolated nucleic acid of claim 4, wherein the nucleic acid encodes an amino 
acid sequence of the SH3-2 domain which corresponds to the region from about 
sequence 908 to about sequence 966 of Figure 5. 

11. The isolated nucleic acid of claim 4, wherein the nucleic acid encodes an amino
acid sequence of the SH3-3 domain which corresponds to the region from about 
sequence 999 to about sequence 1062 of Figure 5. 

12. The isolated nucleic acid of claim 4, wherein the nucleic acid encodes an amino
acid sequence of the SH3-4 domain which corresponds to the region from about 
sequence 1080 to about sequence 1138 of Figure 5. 

13. The isolated nucleic acid of claim 4, wherein the nucleic acid encodes an amino 
acid sequence of the SH3-1 domain which corresponds to the region from about 
sequence 740 to about sequence 800 of Figure 5. 

14. The isolated nucleic acid of claim 1, wherein the nucleic acid encodes an amino
acid sequence as set forth in Figures 5, 9, 11, 13 or 15. 

15. The isolated nucleic acid of claim 1, wherein the nucleic acid is labeled with a 
detectable marker. 



16. The isolated nucleic acid of claim 15, wherein the detectable marker is a 
radioactive isotope, a fluorophor or an enzyme. 

17. An oligonucleotide of at least 15 nucleotides capable of specifically hybridizing
with a sequence of nucleotides present within a nucleic acid which encodes the
human SH3D1A of claim 1.

18. The oligonucleotide of claim 17, wherein the nucleic acid is DNA or RNA.

19. The oligonucleotide of claim 17, wherein the oligonucleotide is labeled with a 
detectable marker. 

20. The oligonucleotide of claim 19, wherein the oligonucleotide is a radioactive 
isotope, a fluorophor or an enzyme. 

21. A nucleic acid having a sequence complementary to the sequence of the isolated 
nucleic acid of claim 1. 

22. An antisense molecule capable of specifically hybridizing with the isolated
nucleic acid of claim 1.

23. A vector comprising the isolated nucleic acid of claim 1.

24. The vector of claim 23, further comprising a promoter of RNA transcription,
or an expression element, operatively linked to the nucleic acid.

25. The vector of claim 23, wherein the promoter comprises a bacterial, yeast, insect 
or mammalian promoter. 

26. The vector of claim 24, further comprising plasmid, cosmid, yeast artificial 
chromosome (YAC), BAC, PI, bacteriophage or eukaryotic viral DNA. 

27. A host vector system for the production of a polypeptide which comprises the 
vector of claim 23 in a suitable host. 

28. The host vector system of claim 27, wherein the suitable host is a prokaryotic or
eukaryotic cell.

29. The host vector system of claim 28, wherein the eukaryotic cell is a yeast, insect,
plant or mammalian cell.

30. A method for producing a polypeptide which comprises growing the host vector
system of claim 23 under suitable conditions permitting production of the
polypeptide and recovering the polypeptide so produced.

31. A method of obtaining a polypeptide in purified form which comprises:

(a) introducing the vector of claim 23 into a suitable host cell;

(b) culturing the resulting cell so as to produce the polypeptide;

(c) recovering the polypeptide produced in step (b); and

(d) purifying the polypeptide so recovered.

32. A polypeptide comprising the amino acid sequence of a human SH3D1A.

33. The polypeptide of claim 32, wherein the amino acid sequence is set forth in
Figure 5.

34. A fusion protein or chimeric comprising the polypeptide of claim 32.

35. An antibody which specifically binds to the polypeptide of claim 33.

36. The antibody of claim 34, wherein the antibody is selected from a chimeric
antibody, a monoclonal antibody, and a polyclonal antibody.

37. A method for determining whether a subject carries a mutation in the SH3D1A
gene which comprises:

(a) obtaining an appropriate nucleic acid sample from the subject; and

(b) determining whether the nucleic acid sample from step (a) is, or is derived
from, a nucleic acid which encodes mutant SH3D1A so as to thereby
determine whether a subject carries a mutation in the SH3D1A gene.

38. The method of claim 36, wherein the nucleic acid sample in step (a) comprises
mRNA corresponding to the transcript of DNA encoding a mutant SH3D1A, and
wherein the determining of step (b) comprises:

(i) contacting the mRNA with the oligonucleotide of claim 17 under
conditions permitting binding of the mRNA to the oligonucleotide
so as to form a complex;

(ii) isolating the complex so formed; and

(iii) identifying the mRNA in the isolated complex so as to thereby determine
whether the mRNA is, or is derived from, a nucleic acid which encodes
mutant SH3D1A.

39. The method of claim 29, wherein the determining of step (b) comprises: 

(i) contacting the nucleic acid sample of step (a), and the isolated 
nucleic acid of claim 1 with restriction enzymes under conditions 
permitting the digestion of the nucleic acid sample, and the 
isolated nucleic acid into distinct, distinguishable pieces of 
nucleic acid; 

(ii) isolating the pieces of nucleic acid; and 

(iii) comparing the pieces of nucleic acid derived from the nucleic acid sample 
with the pieces of nucleic acid derived from the isolated nucleic acid so 
as to thereby determine whether the nucleic acid sample is, or is derived 
from, a nucleic acid which encodes mutant SH3D1A.

40. A method for determining whether a subject has a megakaryocyte abnormality, 
myeloproliferative disorder, platelet disorder, leukemia or neural disorder, which 
comprises: 

(a) obtaining an appropriate sample from the subject; and 

(b) contacting the sample with the antibody of claim 35 so as to thereby 
determine whether a subject has the megakaryocyte abnormality, 
myeloproliferative disorder, platelet disorder, leukemia or neural disorder. 



41. A method for determining whether a subject has a predisposition for a
megakaryocyte abnormality, hematopoietic disorders, myeloproliferative
disorder, platelet disorder, leukemia or neural disorder, which comprises:

(a) obtaining an appropriate nucleic acid sample from the subject; and

(b) determining whether the nucleic acid sample from step (a) is, or is derived
from, a nucleic acid which encodes SH3D1A so as to thereby determine
whether a subject has a predisposition for a megakaryocyte abnormality,
myeloproliferative disorder, platelet disorder, leukemia or neural disorder.

42. The method of claim 41, wherein the sample comprises blood, tissues or sera.

43. A method for determining whether a subject has a megakaryocyte abnormality,
myeloproliferative disorder, platelet disorder, leukemia or neural disorder, which
comprises:

(a) obtaining an appropriate nucleic acid sample from the subject; and

(b) determining whether the nucleic acid sample from step (a) is, or is derived
from, a nucleic acid which encodes the human SH3D1A so as to thereby
determine whether a subject has megakaryocyte abnormality,
myeloproliferative disorder, platelet disorder, leukemia or neural disorder.

44. The method of claim 44, wherein the nucleic acid sample in step (a) comprises 
mRNA corresponding to the transcript of DNA encoding a human SH3D1A, and 
wherein the determining of step (b) comprises: 

(i) contacting the mRNA with the oligonucleotide of claim 25 under 
conditions permitting binding of the mRNA to the oligonucleotide 
so as to form a complex; 

(ii) isolating the complex so formed; and 

(iii) identifying the mRNA in the isolated complex so as to thereby determine 
whether the mRNA is, or is derived from, a nucleic acid which encodes 
a human SH3D1A.

45. A method of suppressing cells unable to regulate themselves which comprises
introducing a purified human SH3D1A into the cells in an amount effective to
suppress the cells.

46. A method for screening a tumor sample from a human subject for a somatic
alteration in a SH3D1A gene in said tumor which comprises comparing a
first sequence selected from the group consisting of a SH3D1A gene from said
tumor sample, SH3D1A RNA from said tumor sample and SH3D1A cDNA made
from mRNA from said tumor sample with a second sequence selected from the
group consisting of SH3D1A gene from a nontumor sample of said subject,
SH3D1A RNA from said nontumor sample and SH3D1A cDNA made from
mRNA from said nontumor sample, wherein a difference in the sequence of the
SH3D1A gene, SH3D1A RNA or SH3D1A cDNA from said tumor sample from
the sequence of the SH3D1A gene, SH3D1A RNA or SH3D1A cDNA from said
nontumor sample indicates a somatic alteration in the SH3D1A gene in said
tumor sample.

47. A method for screening a tumor sample from a human subject for the presence
of a somatic alteration in a SH3D1A gene in said tumor which comprises
comparing SH3D1A polypeptide from said tumor sample from said subject to
SH3D1A polypeptide from a nontumor sample from said subject to analyze for
a difference between the polypeptides, wherein said comparing is performed by
(i) detecting either a full length polypeptide or a truncated polypeptide in each
sample or (ii) contacting an antibody which specifically binds to either an epitope
of an altered SH3D1A polypeptide or an epitope of a wild-type SH3D1A
polypeptide to the SH3D1A polypeptide from each sample and detecting
antibody binding, wherein a difference between the SH3D1A polypeptide from
said tumor sample from the SH3D1A polypeptide from said nontumor sample
indicates the presence of a somatic alteration in the SH3D1A gene in said tumor
sample.

48. A method for identifying a chemical compound which is capable of suppressing
cells unable to regulate themselves in a subject which comprises:

(a) contacting the SH3D1A with a chemical compound under conditions
permitting binding between the SH3D1A and the chemical compound;

(b) detecting specific binding of the chemical compound to the SH3D1A; and

(c) determining whether the chemical compound inhibits the SH3D1A so as
to identify a chemical compound which is capable of suppressing cells
unable to regulate themselves.

49. A method for monitoring the progress and adequacy of treatment in a subject who 
has received treatment for a megakaryocyte abnormality, myeloproliferative 
disorder, platelet disorder, leukemia condition or neural disorder which comprises 
monitoring the level of nucleic acid encoding the human SH3D1A at various 
stages of treatment. 

50. A method for monitoring a prenatal for tumor risk progress or
megakaryocyte abnormality, myeloproliferative disorder, hematopoietic disorder,
platelet disorder, or leukemia which comprises monitoring the level of nucleic
acid encoding the human SH3D1A.

51. A pharmaceutical composition comprising an amount of the polypeptide of claim
1 and a pharmaceutically effective carrier or diluent.

52. A method of treating a subject having megakaryocyte abnormality,
myeloproliferative disorder, platelet disorder, leukemia or neural disorder which
comprises introducing the isolated nucleic acid of claim 1 into the subject under
conditions such that the nucleic acid expresses SH3D1A or its antisense nucleic
acid, so as to thereby treat the subject.



53. The method of claim 52, wherein the subject is a prenatal.




54. A method of treating a subject having megakaryocyte abnormality,
myeloproliferative disorder, hematopoietic disorder, platelet disorder, leukemia
or neural disorder which comprises administering to the subject a therapeutically
effective amount of the pharmaceutical composition of claim 51.

55. The method of claim 54, wherein the subject is a prenatal. 

56. The method of claim 52, wherein the administration comprises topical, oral,
aerosol, subcutaneous administration, infusion, intralesional, intramuscular,
intraperitoneal, intratumoral, intratracheal, intravenous injection, or
liposome-mediated delivery.

57. A transgenic, nonhuman mammal comprising the isolated nucleic acid of claim 1.




Figure 1
SH3D1A



1 CAAAAG&ATT cm3OTA03G OTXTCGXA GGAAGAATCC CGAGOGG3CT 
51 0CO3GAQ3GSA CAGAGAO30G GGCGGQGATG GIOIG03GGG CIG03GOGC 
101 TCCGTCCCTC CCAGCGGCGC GI»jXGGCA CTGATTIGIC CCT33GG033 
151 CAGCGOGGAC CmXQOGAG AIG^33CX3TC GAOTAGCAAG GTAAAACTAA 
201 CAGAACCAIG G^ITAGTTIC CAACAOCTTT TOGIGGCAGC CTGGA03&CT 
251 QOGCCATAAC TCT&GAQGAA AGAGCGAAGC ATGATCAGCA GTI^XATAGT 
301 TIAAAGCCAA TATCIQGAT1? GATTJCIGCT GATC&AGCTA GAAALTmT 
351 TTTICAAICT G3OT3CCIC AACCnX3ITIT AGCSCAGA2A TOGGCACIAG 
401 CIGACATOAA TAATCATOGA AAGOaSAGIT TXCC&lfcGCT 

451 AIGAAACTTA 1CAAACIGAA GCTACAA33A TATCAQCICAC CCICKSCTCT 
501 TCCCCCT3TC ATCAAACAX AACCACTIGC mTTICTAGC GC^OOAGCAT 
551 TTGGmTOQG AGGmTCGOC A3CASGCC2C GGCTTACAGC TOITCCTQCA 
601 GIGCX^AItSG GATCCATTCC AWflUl'lUa ATCICTCCAA CCCTAOIATC 
651 TICIGTICCC ACAGCAGCTG T30CCCCCCT GCOMO^G GCICCCCCTO 
7C1 TMACAACC TTIGCICATC CTCCAGCCAC AT3X3GCAAAG 

751 AGTICITCCI TXACT&GATC TOGTCCAGG3 TCACAACTAA ACACIAAATT 
801 ACAAAAGGCA CAGICATTIG ATOIGQCCAG TGIOXACCA GIGGCAGAOT 
851 aX5CTTGTICC TCAGICATCA ^SACTGAAAT A^AGXAATT ATTCAA2AOT 
901 CATGACAAAA CTASGAGIGG ACACTOAACA GGICCCCAAG CAAGAACIAT 
951 TCI'i m QC A G TCAAOTTTAC CACAGQCICA GCT3GCTTCA A1A1GGAATC 
1001 TITCIGACAT T^ICAAGAT GGAAAACTTA CAGCAGAGGA ATITATOCTG
1051 GCAATGCACC TCATTGATGI MCXMGICi 1 QGCCAACCAC OX^CACCTOT

1101 CCTOCCTOCA GAMftCATIC GACCTXCTIT TAGAAGAGTLT CGATCTOGCA 

1151 QIGGIATATC TCTCAIAA3C TCAACATCTG TAGATCAGfcG GCXftCCAGAG 

1201 GAACCAGITr TAGAAGATGA ACAACAACAA TTAGAAAAGA AATTACCR2T 

1251 AACGTTIGAA gataagaagc gggagaactt tcaacotggc aacciqgaac 

1301 TOGAGftAACG AA03CAAGCT CTCCIQGAAC AGCAG33CAA GGAGCMGAG 

1351 G3CCIQGCXX: AGCIGGAGCG GGCQGAQCAG GAG?y33A£GG AGOGIGAGQG 

1401 CCAGGftGCAA GAGOGCAAAA GACAACTGGA ACIGGAGAAG CAACTOGAAA 

1451 ^3CAGCG33A GCTftGAAOGG CSGAGAGAGG AGGAGAGGAG GAAAGAAATT 

1501 GAG&3GOG&G AGGCTGCAAA AOGGGAACTT GAAAGGCAAC GACAACTIGA 

1551 GIG3GAA09G AATCGAAGGC AAGAACEACT AAATCAAAGA AACAAAGAAC 

1601 AAGAGGACAT ACTICTACIG AAAGCAAAGA AAAAGAdTT QGAASTEGAA 

1651 TI&GAAGCIC TAAATGATAA AAftGCAICAA CTAGASQGGA AACTTCAAGA 

1701 TATCAGATCT CGATIGAOCA CCCAAAOGCA AGAAATIGAG AGCACAAACA 

1751 AATCTAG&3A GTIGAGAATT GCCGAAATCA CCCATOTACA GGAAGAACTA 

1801 CAGGAATCTC AGCAAATGCT TCGAAGACTT ATTCCAGAAA AACAGATACT 

1851 CMTCACCAA TIAAAACAAG TICAGCAGAA CAGTTIQGAC AGAGAITCAC 

1901 TIOTIACACT TAAAAGAGCC TEftGAAGCAA AAGAACTAGC TUjGCAGCAC 

1951 CTACGAGAX AACIGGATCA AGTOGAGAAA GAAACT&3AT CAAAACTACA 

2001 GGAGATtGAT ATHTCAATA ATCAGCTGAA GGAACTAAGA GAAATCACACA 

2051 ATAAGCAACA AOTXAGAA3 CAAAAGTOCA TOGAGGCIGA ACGACTGAAA 

2101 CAGAAAGAAC AAGAACGAAA G&TCAT&GAA TTAGAAAAAC AAAAAGAAGA 

2151 AGCOCAA&SA CGA3CT2AGG AAA33GACAA CX2OT3GCIG GAGCATOIGC 

2201 AGC&GGAGGA CGftGCATCAG AGAOCAAGAA AACTCCAOGA AGAGGAAAAA 

2251 CTSAAAAQGG AGGAG^GIOT CAAAAAGAAG GATOGQGAGG AAAAAGQCAA 
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2301 &2K3GAAGCA CAAGACAAGC TOC33T033Cr mCCAlCAA CACCAAGAAC 
2351 CAGCTAAGCC AGCTGICC2G QCACCX7IQGT 030GCAGA AAAAG3ICCA 
2401 CTTACCATIT CIQCACAGGA AAAICIMAA GIGGTOIATT ACCX3GGCACT 
2451 CTACCCCTTT G^TCCTG&A GCCMGAIGA AATCfcCTATC CAGCCAGGAG 
2501 ACATAGTCAT GGTO»T3AA AGCX3^AACT3 GAGAAGOCGG CTGQCTIGGA 
2551 GG&GAA371&A AAGGAAAGAC CCTOCAAACT ATGCACiAGAA 

2601 AATCCCAGAA AASGAGGTIC CCGCTCGftOT GAAACCAGIG AC1GATICAA 
2651 CATCTCCCCC OtXCOXAAA CTOGCCTIGC CTSftGACCCC GGCGQCTTTG 
2701 (X2CTAACCT CTICAGAGCC CTCCAQGAOC CCTAAIAACT GGGCCGACTT 
2751 CAGCntCAOG TQGaXACCA G3&3AA2GA GAAACCAGAA AC^TAACT 
2801 QQGftUGCMG GGCAGCGCAG CCCTCTCIC^ COGTTCO^G TCCCX3G0CAG 
2851 TTAAGGCAGA GGTCHSCCTT TACIOMCC AOSSaCAClG CX3CCTCCCC 
2901 G1CICCIGIG CT*30CXfcQG GIGAAAAGOT QG&GQQGCTA CAAGCICAAG 
2951 CCX^TATCX: OTGGAGAGQC AAAAAAGACA ACCACTI&AA TITIAACAAA 
3001 AATCATGTCA TCACXGICCT OGAACAGCAA GACATXSTCOT Q3ITIQG&GA 
3051 AOTICAAGGT CAGAAQ33IT GCJTTOCCGAA GICITACSIG AAACTCMTT 
3101 CAC<3GCOC^T AAGGAAGICT ACAAGCAIGG ATICIGOTIC TTCAGAGftCT 
3151 CCIGCTftGTC TAAAGOGACT ^CTCICCA GC^3CCAAGC CQQIQ3TTIC 
3201 GGGAGAAGAA ATIQOCCAGG TlWi'llCCTC A03CACOGCC ACD3GCCCCG 

3251 agcagctcac Tcraxmrr qgicmciga titigatccg aaaaaas^ac 

3301 CCAGGTOGAT GGIGGGAAGG AGAQCT3CAA GCAGGIQGGA AAAAGCGCCA 

3351 GATAGGCTGG TIUXAGCIA A1TA!IX?IAAA GCITCTAAGC CCIQGGACGA 

3401 QCAAAATCAC TCCAACAGAG CCACCTAAGT CAACAQCAIT AGOGGCaGIG 

3451 T3CGMOTGA TIGGGATOIA CGACTftCACC GQGCAGAATC ACGA1GAGCT
3501 GGOCITCAAC AAGGGCCAGA QOflCAMGrT CCOTACAAG GMGftOOCIG

3551 /CTX3SIGGAA AGGAGSAOTC MTOG&CAAG TGGGGCICXT CCCAICCAAT 

3601 TATOIGAAGC 2GACOOGA CAIGGAQXA AGCCAGCAAT GAATCAIMG 

3551 TI GI CGATCC CCOCUICAQG CTIGAA&3IC CICAAAGAGA CCCACTA2CC 

3701 oasacacas cocsgagosa tcatcggaga tocagcctig atcatctoac 

3751 TICCAGCATG ATO£CT£CT GCCI'lClGRO TAGAAGAACT CACK3CAGAG 

3801 CAGTTCACCT CAXTmcCT TAQTIGCATG TCATOGGAAT GCTIGAOITA 

3851 TEACITCCAG AGAXAQGAQC AAAAA1TACA AAAACACACA 

3901 TCCTTTIGIG GOTITCCTAG TIACICAAAT TCAOTITCCC 

3951 ACAGSIGCTT TCAATAGJTT TAAAATmiT TTEAAA1S22A TAOTTOGCT 

4001 TTITAATAAA CAAAATAAAT AAA73ACFIC TOCTITIOCA 

4051 AAAAGACCCA cmTCA&GGA ATCCT3CATG TOCTATIAAA AftTlUl'lUCA 

4101 AAOTICCATA AATCIGSGAC TIGAT3EATT TITICATTIT GTCCACTGTT 

4151 ACXZAACTAAA TIGCTQCAGT TIGOGGCITr TCQOCCTTAC CATAGAAG7IG 

4201 CAGAQGSGIT CAGSATCICT GTITIAAAGA CDIMAGAAT GAGCCXAATT 

4251 AAAGCGAAGG TGATIUIGCT TQTITGIGIG TATCAGCT3T ACLTIWIIGA 

4301 QCATOIAATA CAIOT3EAC ATAAGAAATT AGTICITICC A3X3GCAAAGC 

4351 TAITADCTIG TACGATOCIC TAATCASATT GCAITEAATT TTAilTIUCA 

4401 aCAGTOACCT TC0X3CCACA TGAGAAAGCA CICIUIGTIT TTGOT03SIC 

4451 TCAGATTEAT CIOTTIGAGT TOCJIOTTTIG TTIGGGGTIT TIAATITTGC 

4501 G mmUU KT AGCATAAAAT CACTftGACAA CAGCACIGAG GTCCT1AOGA 

4551 TCAAOGAIAT CCACAGTCTC TlTl'iA GICT ClOriKaEDS MbTlTlKET 

4601 CCA^TCACIT TICATOGAAT GACCHALLTIT GAACAAGHAA TITXCTIGAC 

4651 AAGAAAGAAT OEATAGAAGT CTCOCTQCAA TIAATITCCA ATCTTIACAT 
4701 ITITIA ACZA GGACTCIGGA ATITCBOG ATEAATAIGA AAJQGAGCTC 
4751 ATGGTCCGTT TOT&TUITAG AOMGCIGIA GCKSAAGOCC 

4801 TI2AACACIA UX'iUUAAXT CBJAA3AAAA ATGCCIOCIG 

4851 AGAAAA.TO3G GCAGQOQSftG GCICftflGCAC MTCIftSCIG TCCTCCTAAA 

4901 GACICICTAA T3CTCAATCC OCTIOXSI'IC TOCD3303CT GfTCGQGflGQC 

4951 1GTOC1Q3IG GICOiQISGA OJiOL'lTi'lU CTTCCAAATG GIQCAGRG&G 

5001 AG&GGaCCTT 'lLUlCCi'lWi' TCaGTTOCAA TICSGIATrT ICAOSGATOT 

5051 GAATCTAAM. TAlKCftMTA 1AEAAACCTG K3GAXITAAC AAATOERAAA 

5101 CAflCCi'i'l'iG AATXM3TICC GftOIATSGAT AftTEAAATIT TTkAA&ZAAA 

5151 ASEAAAAAAA AAAAAAAftAA AAAAAAA&AA, AAAAGI03AC GCG3C0GO3 
SH3D1A Translated Protein Sequence: 

1 MAQFFOPFGG 3LDIWAITVE ERAKKDQQFH SLKPISGFIT GDQARNFFFQ 

51 SGLFQPVLftQ IWALAEMNND GRMDQVEFSI AMKLIKLKLQ CTQLPSALPP 

101 VMK32PVMS SAPAPGM3GI ASMPPLTAVA FVJM3SIPW GMSPXLVSSV 

151 PIAAVPPLAN GAPPVIQPLP AFAHPAATLP XSSSESRSGP GSQUJIKD3K 

201 AQSTOVASVP PVAEKAVPQS SRLKYRQLFN SfOCIWSGHL TCPQARmtf 

251 QSSLPQfiQLA SH*CSDIDQ DGKL1AEEFI IAMHLIEVAM SG2PLPFVL3? 

301 PEXIPPSFRR VRSGSGISVI SSISVDQPLP EEPVLED3QQ QLEKKLPVTF 

351 EDKKRENFER C3CjELEKRBQ ALLE2QRKEQ ERLftQLERAE QERKEREPQE 

401 QEKKRQLELE KQLSKQRELE PQREEERRKE IERREAAKRE I^RQRQLEWE 

451 RNREQELLNQ RNKEQEQIVV LKAKKKTLEF ELSMM3CKH QLB3KI£DIR 

501 CRLTTQRQEI ES1NKSRELR IAETEHLQQQ LQESQ3ylIjt2l LIEERQHf© 

551 QLKQVIS^SL HKDSLVTLKR ALEAKELAFQ HISDQLEEVE K2mSKI£EI 

601 nSTOL REIHNKQQLQ KQKSMEAERL KQKB3ERKII ELEK^KEEAQ 
651 HRAQEBEKQW LEHVQQEBEH QHPRKLHEEE KLKREESVKK KDGEEKGKQE 

701 AQDKLGRLiFH {J3QEEfcKPAV QftFWSiaEKS PLTIS^ENV KWYYRALYP 

751 FESRSHDEUT IQPGHIVMVD SSQP3EPGWL GGELKQCTGW FEAMYAESCEP 

801 ENEVPAPVK? VSDSESAfiftP KLALHBTPAP IAVTSSEPST TH&W&DFSS 

851 IWPTSTOEKP HHMCfiWaA QPSLTVPSB3 QLEQRSAFTP MMGSSPSP 

901 VL333EKVEG D2^CALVFWR AKKENHLMFN KtOOTVlSQ OMWFGEVQ 

951 Q3KGWFPKSY VKLISGPIRK OTSMDSGSSS SPASLKKVAS PAAKFWS3E 

1001 EXPQJT&ST? ATCFEQUIIA FQQLILIKKK NFGGWEJ^L QAHGKKRQIG 

1051 WFPANWKLL SKttSKilVl 1 EPFKSTMAA VOQVIGKiOT 

1101 NKQQXIItM^ KEDPC3WWKGE VNGQVGLFPS PSQ 
1 GCACGAGAGG GAGCGAAGGA GGTAGAGAAG AGTGGAGGCG CCAGGGGAGG 
51 GAGCGTAGCT TGGTTGCTCC GTAGTACGGC GGCTCGCGAG GAAGAATCCC 
101 GAGCGGGCTC CGGGACGGAC AGAGAGGCGG GCGGGGATGG TGTGCGGGGC 
151 TGCGGCTCCT GCGTCCCTCC CAGCGGCGCG TGAGCGGCAC TGATTTGTCC 
201 CTGGGGCGGC AGCGCGGACC CGCCCGGAGA TGAGGCGTCG ATTAGCAAGG 
251 TAAAAGTAAC AGAACCATGG CTCAGTTTCC AACACCTTTT GGTGGCAGCC 
301 TGGATATCTG GGCCATAACT GTAGAGGAAA GAGCGAAGCA TGATCAGCAG 
351 TTCCATAGTT TAAAGCCAAT ATCTGGATTC ATTACTGGTG ATCAAGCTAG 
401 AAACTTTTTT TTTCAATCTG GGTTACCTCA ACCTGTTTTA GCACAGATAT 
451 GGGCACTAGC TGACATGAAT AATGATGGAA GAATGGATCA AGTGGAGTTT 
501 TCCATAGCTA TGAAACTTAT CAAACTGAAG CTACAAGGAT ATCAGCTACC 
551 CTCTGCACTT CCCCCTGTCA TGAAACAGCA ACCAGTTGCT ATTTCTAGCG 
601 CACCAGCATT TGGTATGGGA GGTATCGCCA GCATGCCACC GCTTACAGCT 
651 GTTGCTCCAG TGCCAATGGG ATCCATTCCA GTTGTTGGAA TGTCTCCAAC 
701 CCTAGTATCT TCTGTTCCCA CAGCAGCTGT GCCCCCCCTG GCTAACGGGG 
751 CTCCCCCTGT TATACAACCT CTGCCTGCAT TTGCTCATCC TGCAGCCACA 
801 TTGCCAAAGA GTTCTTCCTT TAGTAGATCT GGTCCAGGGT CACAACTAAA 
851 CACTAAATTA CAAAAGGCAC AGTCATTTGA TGTGGCCAGT GTCCCACCAG 
901 TGGCAGAGTG GGCTGTTCCT CAGTCATCAA GACTGAAATA CAGGCAATTA 
951 TTCAATAGTC ATGACAAAAC TATGAGTGGA CACTTAACAG GTCCCCAAGC 
1001 AAGAACTATT CTTATGCAGT CAAGTTTACC ACAGGCTCAG CTGGCTTCAA 
1051 TATGGAATCT TTCTGACATT GATCAAGATG GAAAACTTAC AGCAGAGGAA 
1101 TTTATCCTGG CAATGCACCT CATTGATGTA GCTATGTCTG GCCAACCACT 
1151 GCCACCTGTC CTGCCTCCAG AATACATTCC ACCTTCTTTT AGAAGAGTTC 
1201 GATCTGGCAG TGGTATATCT GTCATAAGCT CAACATCTGT AGATCAGAGG 
1251 CTACCAGAGG AACCAGTTTT AGAAGATGAA CAACAACAAT TAGAAAAGAA 
1301 ATTACCTGTA ACGTTTGAAG ATAAGAAGCG GGAGAACTTT GAACGTGGCA 
1351 ACCTGGAACT GGAGAAACGA AGGCAAGCTC TCCTGGAACA GCAGCGCAAG 
1401 GAGCAGGAGC GCCTGGCCCA GCTGGAGCGG GCGGAGCAGG AGAGGAAGGA 
1451 GCGTGAGCGC CAGGAGCAAG AGCGCAAAAG ACAACTGGAA CTGGAGAAGC 
1501 AACTGGAAAA GCAGCGGGAG CTAGAACGGC AGAGAGAGGA GGAGAGGAGG 
1551 AAAGAAATTG AGAGGCGAGA GGCTGCAAAA CGGGAACTTG AAAGGCAACG 
1601 ACAACTTGAG TGGGAACGGA ATCGAAGGCA AGAACTACTA AATCAAAGAA 
1651 ACAAAGAACA AGAGGACATA GTTGTACTGA AAGCAAAGAA AAAGACTTTG 
1701 GAATTTGAAT TAGAAGCTCT AAATGATAAA AAGCATCAAC TAGAAGGGAA 
1751 ACTTCAAGAT ATCAGATGTC GATTGACCAC CCAAAGGCAA GAAATTGAGA 
1801 GCACAAACAA ATCTAGAGAG TTGAGAATTG CCGAAATCAC CCATCTACAG 
1851 CAACAATTAC AGGAATCTCA GCAAATGCTT GGAAGACTTA TTCCAGAAAA 
1901 ACAGATACTC AATGACCAAT TAAAACAAGT TCAGCAGAAC AGTTTGCACA 
1951 GAGATTCACT TGTTACACTT AAAAGAGCCT TAGAAGCAAA AGAACTAGCT 
2001 CGGCAGCACC TACGAGACCA ACTGGATGAA GTGGAGAAAG AAACTAGATC 
2051 AAAACTACAG GAGATTGATA TTTTCAATAA TCAGCTGAAG GAACTAAGAG 
2101 AAATACACAA TAAGCAACAA CTCCAGAAGC AAAAGTCCAT GGAGGCTGAA 
2151 CGACTGAAAC AGAAAGAACA AGAACGAAAG ATCATAGAAT TAGAAAAACA 
2201 AAAAGAAGAA GCCCAAAGAC GAGCTCAGGA AAGGGACAAG CAGTGGCTGG 
2251 AGCATGTGCA GCAGGAGGAC GAGCATCAGA GACCAAGAAA ACTCCACGAA 
2301 GAGGAAAAAC TGAAAAGGGA GGAGAGTGTC AAAAAGAAGG ATGGCGAGGA 
2351 AAAAGGCAAA CAGGAAGCAC AAGACAAGCT GGGTCGGCTT TTCCATCAAC 
2401 ACCAAGAACC AGCTAAGCCA GCTGTCCAGG CACCCTGGTC CACTGCAGAA 
2451 AAAGGTCCAC TTACCATTTC TGCACAGGAA AATGTAAAAG TGGTGTATTA 
2501 CCGGGCACTG TACCCCTTTG AATCCAGAAG CCATGATGAA ATCACTATCC 
2551 AGCCAGGAGA CATAGTCATG GTTAAAGGGG AATGGGTGGA TGAAAGCCAA 
2601 ACTGGAGAAC CCGGCTGGCT TGGAGGAGAA TTAAAAGGAA AGACAGGGTG 
2651 GTTCCCTGCA AACTATGCAG AGAAAATCCC AGAAAATGAG GTTCCCGCTC 
2701 CAGTGAAACC AGTGACTGAT TCAACATCTG CCCCTGCCCC CAAACTGGCC 
2751 TTGCGTGAGA CCCCCGCCCC TTTGGCAGTA ACCTCTTCAG AGCCCTCCAC 
2801 GACCCCTAAT AACTGGGCCG ACTTCAGCTC CACGTGGCCC ACCAGCACGA 
2851 ATGAGAAACC AGAAACGGAT AACTGGGATG CATGGGCAGC CCAGCCCTCT 
2901 CTCACCGTTC CAAGTGCCGG CCAGTTAAGG CAGAGGTCCG CCTTTACTCC 
2951 AGCCACGGCC ACTGGCTCCT CCCCGTCTCC TGTGCTAGGC CAGGGTGAAA 
3001 AGGTGGAGGG GCTACAAGCT CAAGCCCTAT ATCCTTGGAG AGCCAAAAAA 
3051 GACAACCACT TAAATTTTAA CAAAAATGAT GTCATCACCG TCCTGGAACA 
3101 GCAAGACATG TGGTGGTTTG GAGAAGTTCA AGGTCAGAAG GGTTGGTTCC 
3151 CCAAGTCTTA CGTGAAACTC ATTTCAGGGC CCATAAGGAA GTCTACAAGC 
3201 ATGGATTCTG GTTCTTCAGA GAGTCCTGCT AGTCTAAAGC GAGTAGCCTC 
3251 TCCAGCAGCC AAGCCGGTCG TTTCGGGAGA AGAATTTATT GCCATGTACA 
3301 CTTACGAGAG TTCTGAGCAA GGAGATTTAA CCTTTCAGCA AGGGGATGTG 
3351 ATTTTGGTTA CCAAGAAAGA TGGTGACTGG TGGACAGGAA CAGTGGGCGA 
3401 CAAGGCCGGA GTCTTCCCTT CTAACTATGT GAGGCTTAAA GATTCAGAGG 
3451 GCTCTGGAAC TGCTGGGAAA ACAGGGAGTT TAGGAAAAAA ACCTGAAATT 
3501 GCCCAGGTTA TTGCCTCATA CACCGCCACC GGCCCCGAGC AGCTCACTCT 
3551 CGCCCCTGGT CAGCTGATTT TGATCCGAAA AAAGAACCCA GGTGGATGGT 
3601 GGGAAGGAGA GCTGCAAGCA CGTGGGAAAA AGCGCCAGAT AGGCTGGTTC 
3651 CCAGCTAATT ATGTAAAGCT TCTAAGCCCT GGGACGAGCA AAATCACTCC 
3701 AACAGAGCCA CCTAAGTCAA CAGCATTAGC GGCAGTGTGC CAGGTGATTG 
3751 GGATGTACGA CTACACCGCG CAGAATGACG ATGAGCTGGC CTTCAACAAG 
3801 GGCCAGATCA TCAACGTCCT CAACAAGGAG GACCCTGACT GGTGGAAAGG 
3851 AGAAGTCAAT GGACAAGTGG GGCTCTTCCC ATCCAATTAT GTGAAGCTGA 
3901 CCACAGACAT GGACCCAAGC CAGCAATGAA TCATATGTTG TCCATCCCCC 
3951 CCTCAGGCTT GAAAGTCCTC AAAGAGACCC ACTATCCCAT ATCACTGCCC 
4001 AGAGGGATGA TGGGAGATGC AGCCTTGATC ATGTGACTTC CAGCATGATC 
4051 ACCTACTGCC TTCTGAGTAG AAGAACTCAC TGCAGAGCAG TTTACCTCAT 
4101 TTTACCTTAG TTGCATGTGA TCGCAATGTT TGAGTTATTA CTTGCAGAGA 
4151 TAGGAGCAAA AATTACAAAA ACACACAGGG TAGTGGGTCC TTTTGTGGCT 
4201 TTCCTAGTTA CTCAAATTGA CTTTCCCCCA CCTTTGCACA GGTGCTTTCA 
4251 ATAGTTTTAA AATTATTTTT AAATATATAT TTTAGCTTTT TAATAAACAA 
4301 AATAAATAAA TGACTTCTTT GCTATTTTGG TTTTGCAAAA AGACCCACTA 
4351 TCAAGGAATG CTGCATGTGC TATTAAAAAT TGTTCCAAAT GTCCATAAAT 
4401 CTGAGACTTG ATGTATTTTT TCATTTTGTC CAGTGTTACC AACTAAATTG 
4451 TGCAGTTTGG GGCTTTTCCC CCTTACCATA GAAGTGCAGA GGAGTTCAGT 
4501 ATCTCTGTTT TAAAGACGTA TAGAATGAGC CCAATTAAAG CGAAGGTGTT 
4551 TGTGCTTGTT TGTGTGTATC AGCTGTACCT TGTTGAGCAT GTAATACATC 
4601 CTGTACATAA GAAATTAGTT CTTTCCATGG CAAAGCTATT ACCTTGTACG 
4651 ATGCTCTAAT CATATTGCAT TTAATTTTAT TTTGCACAGT GACCTTGTAG 
4701 CCACATGAGA AAGCACTCTG TGTTTTTGTT CGGTCTCAGA TTTATCTGGT 
4751 TGAGTTGGTG TTTTGTTTGG GGTTTTTAAT TTTGCGTGTT TGCATAGCAT 
4801 AAAATCAGTA GACAACACCA CTGAGGTCGT TACGATCAAC GATATCCACA 
4851 GTCTCTTTTT AGTCTCTGTT ACATGAAGTT TTATTCCAGT TACTTTTCAT 
4901 GGAATGACCT ATTTTGAACA AGTAATTTTC TTGACAAGAA AGAATGTATA 
4951 GAAGTCTCCC TGCAATTAAT TTCCAATGTT TACATTTTTT AACTAGACTG 
5001 TGGAATTTCT ACAGATTAAT ATGAAATGGA GCTCATGGTC CGTTTGTGTG 
5051 TTAGATATGC TGTAGCTGAA GCCCTGTTTG TCTTTTAAAC ACTAGTTGGA 
5101 AGCTCTCAAT AAAAATGCCT GCTGCTCACA GCACAGAAAA TGGGGCAGGG 
5151 GGAGCCTCAA GCACAATCTA GCTGTCCTCC TAAAGACTCT GTAATGCTCA 
5201 CTCCCCTCGC GTTCTCCCGG CGCTGTCGGG AGGCTGTGCT GGTGGTCGTG 
5251 TAGAGGTCCT TCTCCTTTCA CATGGTGCAG AGAGCGAGGA CCTCTCCTCC 
5301 TCGTTCAGTT GCACTTCAGT ATTTTCACGG ATATGAATGT AAAATATATA 
5351 AATATATAAA CCTGCGGCTT TAACAACTGT AATACAACCT TTTGAATTAG 
5401 TTCCGTGTAT AGATAATTAA ATTCTTCATA CAAAAGTTAA AAAAAAAAAA 
5451 AAAAAAAA 
#21 translated protein sequence: 



1 MAQFPTPFGG SLDIWAITVE ERAKHDQQFH SLKPISGFIT GDQARNFFFQ 
51 SGLPQPVLAQ IWALADMNND GRMDQVEFSI AMKLIKLKLQ GYQLPSALPP 
101 VMKQQPVAIS SAPAFGMGGI ASMPPLTAVA PVPMGSIPVV GMSPTLVSSV 
151 PTAAVPPLAN GAPPVIQPLP AFAHPAATLP KSSSFSRSGP GSQLNTKLQK 
201 AQSFDVASVP PVAEWAVPQS SRLKYRQLFN SHDKTMSGHL TGPQARTILM 
251 QSSLPQAQLA SIWNLSDIDQ DGKLTAEEFI LAMHLIDVAM SGQPLPPVLP 
301 PEYIPPSFRR VRSGSGISVI SSTSVDQRLP EEPVLEDEQQ QLEKKLPVTF 
351 EDKKRENFER GNLELEKRRQ ALLEQQRKEQ ERLAQLERAE QERKERERQE 
401 QERKRQLELE KQLEKQRELE RQREEERRKE IERREAAKRE LERQRQLEWE 
451 RNRRQELLNQ RNKEQEDIVV LKAKKKTLEF ELEALNDKKH QLEGKLQDIR 
501 CRLTTQRQEI ESTNKSRELR IAEITHLQQQ LQESQQMLGR LIPEKQILND 
551 QLKQVQQNSL HRDSLVTLKR ALEAKELARQ HLRDQLDEVE KETRSKLQEI 
601 DIFNNQLKEL REIHNKQQLQ KQKSMEAERL KQKEQERKII ELEKQKEEAQ 
651 RRAQERDKQW LEHVQQEDEH QRPRKLHEEE KLKREESVKK KDGEEKGKQE 
701 AQDKLGRLFH QHQEPAKPAV QAPWSTAEKG PLTISAQENV KVVYYRALYP 
751 FESRSHDEIT IQPGDIVMVK GEWVDESQTG EPGWLGGELK GKTGWFPANY 
801 AEKIPENEVP APVKPVTDST SAPAPKLALR ETPAPLAVTS SEPSTTPNNW 
851 ADFSSTWPTS TNEKPETDNW DAWAAQPSLT VPSAGQLRQR SAFTPATATG 
901 SSPSPVLGQG EKVEGLQAQA LYPWRAKKDN HLNFNKNDVI TVLEQQDMWW 
951 FGEVQGQKGW FPKSYVKLIS GPIRKSTSMD SGSSESPASL KRVASPAAKP 
1001 VVSGEEFIAM YTYESSEQGD LTFQQGDVIL VTKKDGDWWT GTVGDKAGVF 
1051 PSNYVRLKDS EGSGTAGKTG SLGKKPEIAQ VIASYTATGP EQLTLAPGQL 
1101 ILIRKKNPGG WWEGELQARG KKRQIGWFPA NYVKLLSPGT SKITPTEPPK 
1151 STALAAVCQV IGMYDYTAQN DDELAFNKGQ IINVLNKEDP DWWKGEVNGQ 
1201 VGLFPSNYVK LTTDMDPSQQ * 
Whole protein sequence 

1 TRGSEGGREE WRRQGRERSL VAP*YGGSRG RIPSGLRDGQ RGGRGWCAGL 
51 RLLRPSQRRV SGTDLSLGRQ RGPARR*GVD *QGKSNRTMA QFPTPFGGSL 
101 DIWAITVEER AKHDQQFHSL KPISGFITGD QARNFFFQSG LPQPVLAQIW 
151 ALADMNNDGR MDQVEFSIAM KLIKLKLQGY QLPSALPPVM KQQPVAISSA 
201 PAFGMGGIAS MPPLTAVAPV PMGSIPVVGM SPTLVSSVPT AAVPPLANGA 
251 PPVIQPLPAF AHPAATLPKS SSFSRSGPGS QLNTKLQKAQ SFDVASVPPV 
301 AEWAVPQSSR LKYRQLFNSH DKTMSGHLTG PQARTILMQS SLPQAQLASI 
351 WNLSDIDQDG KLTAEEFILA MHLIDVAMSG QPLPPVLPPE YIPPSFRRVR 
401 SGSGISVISS TSVDQRLPEE PVLEDEQQQL EKKLPVTFED KKRENFERGN 
451 LELEKRRQAL LEQQRKEQER LAQLERAEQE RKERERQEQE RKRQLELEKQ 
501 LEKQRELERQ REEERRKEIE RREAAKRELE RQRQLEWERN RRQELLNQRN 
551 KEQEDIVVLK AKKKTLEFEL EALNDKKHQL EGKLQDIRCR LTTQRQEIES 
601 TNKSRELRIA EITHLQQQLQ ESQQMLGRLI PEKQILNDQL KQVQQNSLHR 
651 DSLVTLKRAL EAKELARQHL RDQLDEVEKE TRSKLQEIDI FNNQLKELRE 
701 IHNKQQLQKQ KSMEAERLKQ KEQERKIIEL EKQKEEAQRR AQERDKQWLE 
751 HVQQEDEHQR PRKLHEEEKL KREESVKKKD GEEKGKQEAQ DKLGRLFHQH 
801 QEPAKPAVQA PWSTAEKGPL TISAQENVKV VYYRALYPFE SRSHDEITIQ 
851 PGDIVMVKGE WVDESQTGEP GWLGGELKGK TGWFPANYAE KIPENEVPAP 
901 VKPVTDSTSA PAPKLALRET PAPLAVTSSE PSTTPNNWAD FSSTWPTSTN 
951 EKPETDNWDA WAAQPSLTVP SAGQLRQRSA FTPATATGSS PSPVLGQGEK 
1001 VEGLQAQALY PWRAKKDNHL NFNKNDVITV LEQQDMWWFG EVQGQKGWFP 
1051 KSYVKLISGP IRKSTSMDSG SSESPASLKR VASPAAKPVV SGEEFIAMYT 
1101 YESSEQGDLT FQQGDVILVT KKDGDWWTGT VGDKAGVFPS NYVRLKDSEG 
1151 SGTAGKTGSL GKKPEIAQVI ASYTATGPEQ LTLAPGQLIL IRKKNPGGWW 
1201 EGELQARGKK RQIGWFPANY VKLLSPGTSK ITPTEPPKST ALAAVCQVIG 
1251 MYDYTAQNDD ELAFNKGQII NVLNKEDPDW WKGEVNGQVG LFPSNYVKLT 
1301 TDMDPSQQ*I ICCPSPPQA* KSSKRPTIPY HCPEG*WEMQ P*SCDFQHDH 
1351 LLPSE*KNSL QSSLPHFTLV ACDRNV*VIT CRDRSKNYKN TQGSGSFCGF 
1401 PSYSN*LSPT FAQVLSIVLK LFLNIYFSFL INKINK*LLC YFGFAKRPTI 
1451 KECCMCY*KL FQMSINLRLD VFFHFVQCYQ LNCAVWGFSP LP*KCRGVQY 
1501 LCFKDV*NEP N*SEGVCACL CVSAVPC*AC NTSCT*EISS FHGKAITLYD 
1551 ALIILHLILF CTVTL*PHEK ALCVFVRSQI YLVELVFCLG FLILRVCIA* 
1601 NQ*TTPLRSL RSTISTVSF* SLLHEVLFQL LFME*PILNK *FS*QERMYR 
1651 SLPAINFQCL HFLTRLWNFY RLI*NGAHGP FVC*ICCS*S PVCLLNTSWK 
1701 LSIKMPAAHS TENGAGGASS TI*LSS*RLC NAHSPRVLPA LSGGCAGGRV 
1751 EVLLLSHGAE SEDLSSSFSC TSVFSRI*M* NI*IYKPAAL TTVIQPFELV 
1801 PCIDN*ILHT KVKKKKKK 
1 AGAGTGGAGG CGCCAGGGGA GGGAGCGTAG CTTGGTTGCT CCGTAGTACG 
51 GCGGCTCGCG AGGAAGAATC CCGAGCGGGC TCCGGGACGG ACAGAGAGGC 
101 GGGCGGGGAT GGTGTGCGGG GCTGCGGCTC CTGCGTCCCT CCCAGCGGCG 
151 CGTGAGCGGC ACTGATTTGT CCCTGGGGCG GCAGCGCGGA CCCGCCCGGA 
201 GATGAGGCGT CGATTAGCAA GGTAAAAGTA ACAGAACCAT GGCTCAGTTT 
251 CCAACACCTT TTGGTGGCAG CCTGGATATC TGGGCCATAA CTGTAGAGGA 
301 AAGAGCGAAG CATGATCAGC AGTTCCATAG TTTAAAGCCA ATATCTGGAT 
351 TCATTACTGG TGATCAAGCT AGAAACTTTT TTTTTCAATC TGGGTTACCT 
401 CAACCTGTTT TAGCACAGAT ATGGGCACTA GCTGACATGA ATAATGATGG 
451 AAGAATGGAT CAAGTGGAGT TTTCCATAGC TATGAAACTT ATCAAACTGA 
501 AGCTACAAGG ATATCAGCTA CCCTCTGCAC TTCCCCCTGT CATGAAACAG 
551 CAACCAGTTG CTATTTCTAG CGCACCAGCA TTTGGTATGG GAGGTATCGC 
601 CAGCATGCCA CCGCTTACAG CTGTTGCTCC AGTGCCAATG GGATCCATTC 
651 CAGTTGTTGG AATGTCTCCA ACCCTAGTAT CTTCTGTTCC CACAGCAGCT 
701 GTGCCCCCCC TGGCTAACGG GGCTCCCCCT GTTATACAAC CTCTGCCTGC 
751 ATTTGCTCAT CCTGCAGCCA CATTGCCAAA GAGTTCTTCC TTTAGTAGAT 
801 CTGGTCCAGG GTCACAACTA AACACTAAAT TACAAAAGGC ACAGTCATTT 
851 GATGTGGCCA GTGTCCCACC AGTGGCAGAG TGGGCTGTTC CTCAGTCATC 
901 AAGACTGAAA TACAGGCAAT TATTCAATAG TCATGACAAA ACTATGAGTG 
951 GACACTTAAC AGGTCCCCAA GCAAGAACTA TTCTTATGCA GTCAAGTTTA 
1001 CCACAGGCTC AGCTGGCTTC AATATGGAAT CTTTCTGACA TTGATCAAGA 
1051 TGGAAAACTT ACAGCAGAGG AATTTATCCT GGCAATGCAC CTCATTGATG 
1101 TAGCTATGTC TGGCCAACCA CTGCCACCTG TCCTGCCTCC AGAATACATT 
1151 CCACCTTCTT TTAGAAGAGT TCGATCTGGC AGTGGTATAT CTGTCATAAG 
1201 CTCAACATCT GTAGATCAGA GGCTACGAGA GGAACCAGTT TTAGAAGATG 
1251 AACAACAACA ATTAGAAAAG AAATTACCTG TAACGTTTGA AGATAAGAAG 
1301 CGGGAGAACT TTGAACGTGG CAACCTGGAA CTGGAGAAAC GAAGGCAAGC 
1351 TCTCCTGGAA CAGCAGCGCA AGGAGCAGGA GCGCCTGGCC CAGCTGGAGC 
1401 GGGCGGAGCA GGAGAGGAAG GAGCGTGAGC GCCAGGAGCA AGAGCGCAAA 
1451 AGACAACTGG AACTGGAGAA GCAACTGGAA AAGCAGCGGG AGCTAGAACG 
1501 GCAGAGAGAG GAGGAGAGGA GGAAAGAAAT TGAGAGGCGA GAGGCTGCAA 
1551 AACGGGAACT TGAAAGGCAA CGACAACTTG AGTGGGAACG GAATCGAAGG 
1601 CAAGAACTAC TAAATCAAAG AAACAAAGAA CAAGAGGACA TAGTTGTACT 
1651 GAAAGCAAAG AAAAAGACTT TGGAATTTGA ATTAGAAGCT CTAAATGATA 
1701 AAAAGCATCA ACTAGAAGGG AAACTTCAAG ATATCAGATG TCGATTGACC 
1751 ACCCAAAGGC AAGAAATTGA GAGCACAAAC AAATCTAGAG AGTTGAGAAT 
1801 TGCCGAAATC ACCCATCTAC AGCAACAATT ACAGGAATCT CAGCAAATGC 
1851 TTGGAAGACT TATTCCAGAA AAACAGATAC TCAATGACCA ATTAAAACAA 
1901 GTTCAGCAGA ACAGTTTGCA CAGAGATTCA CTTGTTACAC TTAAAAGAGC 
1951 CTTAGAAGCA AAAGAACTAG CTCGGCAGCA CCTACGAGAC CAACTGGATG 
2001 AAGTGGAGAA AGAAACTAGA TCAAAACTAC AGGAGATTGA TATTTTCAAT 
2051 AATCAGCTGA AGGAACTAAG AGAAATACAC AATAAGCAAC AACTCCAGAA 
2101 GCAAAAGTCC ATGGAGGCTG AACGACTGAA ACAGAAAGAA CAAGAACGAA 
2151 AGATCATAGA ATTAGAAAAA CAAAAAGAAG AAGCCCAAAG ACGAGCTCAG 
2201 GAAAGGGACA AGCAGTGGCT GGAGCATGTG CAGCAGGAGG ACGAGCATCA 
2251 GAGACCAAGA AAACTCCACG AAGAGGAAAA ACTGAAAAGG GAGGAGAGTG 
2301 TCAAAAAGAA GGATGGCGAG GAAAAAGGCA AACAGGAAGC ACAAGACAAG 
2351 CTGGGTCGGC TTTTCCATCA ACACCAAGAA CCAGCTAAGC CAGCTGTCCA 
2401 GGCACCCTGG TCCACTGCAG AAAAAGGTCC ACTTACCATT TCTGCACAGG 
2451 AAAATGTAAA AGTGGTGTAT TACCGGGCAC TGTACCCCTT TGAATCCAGA 
2501 AGCCATGATG AAATCACTAT CCAGCCAGGA GACATAGTCA TGGTGGATGA 
2551 AAGCCAAACT GGAGAACCCG GCTGGCTTGG AGGAGAATTA AAAGGAAAGA 
2601 CAGGGTGGTT CCCTGCAAAC TATGCAGAGA AAATCCCAGA AAATGAGGTT 
2651 CCCGCTCCAG TGAAACCAGT GACTGATTCA ACATCTGCCC CTGCCCCCAA 
2701 ACTGGCCTTG CGTGAGACCC CCGCCCCTTT GGCAGTAACC TCTTCAGAGC 
2751 CCTCCACGAC CCCTAATAAC TGGGCCGACT TCAGCTCCAC GTGGCCCACC 
2801 AGCACGAATG AGAAACCAGA AACGGATAAC TGGGATGCAT GGGCAGCCCA 
2851 GCCCTCTCTC ACCGTTCCAA GTGCCGGCCA GTTAAGGCAG AGGTCCGCCT 
2901 TTACTCCAGC CACGGCCACT GGCTCCTCCC CGTCTCCTGT GCTAGGCCAG 
2951 GGTGAAAAGG TGGAGGGGCT ACAAGCTCAA GCCCTATATC CTTGGAGAGC 
3001 CAAAAAAGAC AACCACTTAA ATTTTAACAA AAATGATGTC ATCACCGTCC 
3051 TGGAACAGCA AGACATGTGG TGGTTTGGAG AAGTTCAAGG TCAGAAGGGT 
3101 TGGTTCCCCA AGTCTTACGT GAAACTCATT TCAGGGCCCA TAAGGAAGTC 
3151 TACAAGCATG GATTCTGGTT CTTCAGAGAG TCCTGCTAGT CTAAAGCGAG 
3201 TAGCCTCTCC AGCAGCCAAG CCGGTCGTTT CGGGAGAAGA ATTTATTGCC 
3251 ATGTACACTT ACGAGAGTTC TGAGCAAGGA GATTTAACCT TTCAGCAAGG 
3301 GGATGTGATT TTGGTTACCA AGAAAGATGG TGACTGGTGG ACAGGAACAG 
3351 TGGGCGACAA GGCCGGAGTC TTCCCTTCTA ACTATGTGAG GCTTAAAGAT 
3401 TCAGAGGGCT CTGGAACTGC TGGGAAAACA GGGAGTTTAG GAAAAAAACC 
3451 TGAAATTGCC CAGGTTATTG CCTCATACAC CGCCACCGGC CCCGAGCAGC 
3501 TCACTCTCGC CCCTGGTCAG CTGATTTTGA TCCGAAAAAA GAACCCAGGT 
3551 GGATGGTGGG AAGGAGAGCT GCAAGCACGT GGGAAAAAGC GCCAGATAGG 
3601 CTGGTTCCCA GCTAATTATG TAAAGCTTCT AAGCCCTGGG ACGAGCAAAA 
3651 TCACTCCAAC AGAGCCACCT AAGTCAACAG CATTAGCGGC AGTGTGCCAG 
3701 GTGATTGGGA TGTACGACTA CACCGCGCAG AATGACGATG AGCTGGCCTT 
3751 CAACAAGGGC CAGATCATCA ACGTCCTCAA CAAGGAGGAC CCTGACTGGT 
3801 GGAAAGGAGA AGTCAATGGA CAAGTGGGGC TCTTCCCATC CAATTATGTG 
3851 AAGCTGACCA CAGACATGGA CCCAAGCCAG CAATGAATCA TATGTTGTCC 
3901 ATCCCCCCCT CAGGCTTGAA AGTCCTTTTG TGGCTTTCCT AGTTACTCAA 
3951 ATTGACTTTC CCCCACCTTT GCACAGGTGC TTTCAATAGT TTTAAAATTA 
4001 TTTTTAAATA TATATTTTAG CTTTTTAATA AACAAAATAA ATAAATGACT 
4051 TCTTTGCTAT TTTGGTTTTG CAAAAAGACC CACTATCAAG GAATGCTGCA 
4101 TGTGCTATTA AAAATTGTTC CAAATGTCCA TAAATCTGAG ACTTGATGTA 
4151 TTTTTTCATT TTGTCCAGTG TTACCAACTA AATTGTGCAG TTTGGGGCTT 
4201 TTCCCCCTTA CCATAGAAGT GCAGAGGAGT TCAGTATCTC TGTTTTAAAG 
4251 ACGTATAGAA TGAGCCCAAT TAAAGCGAAG GTGTTTGTGC TTGTTTGTGT 
4301 GTATCAGCTG TACCTTGTTG AGCATGTAAT ACATCCTGTA CATAAGAAAT 
4351 TAGTTCTTTC CATGGCAAAG CTATTACCTT GTACGATGCT CTAATCATAT 
4401 TGCATTTAAT TTTATTTTGC ACAGTGACCT TGTAGCCACA TGAGAAAGCA 
4451 CTCTGTGTTT TTGTTCGGTC TCAGATTTAT CTGGTTGAGT TGGTGTTTTG 
4501 TTTGGGGTTT TTAATTTTGC GTGTTTGCAT AGCATAAAAT CAGTAGACAA 
4551 CACCACTGAG GTCGTTACGA TCAACGATAT CCACAGTCTC TTTTTAGTCT 
4601 CTGTTACATG AAGTTTTATT CCAGTTACTT TTCATGGAAT GACCTATTTT 
4651 GAACAAGTAA TTTTCTTGAC AAGAAAGAAT GTATAGAAGT CTCCCTGCAA 
4701 TTAATTTCCA ATGTTTACAT TTTTTAACTA GACTGTGGAA TTTCTACAGA 
4751 TTAATATGAA ATGGAGCTCA TGGTCCGTTT GTGTGTTAGA TATGCTGTAG 
4801 CTGAAGCCCT GTTTGTCTTT TAAACACTAG TTGGAAGCTC TCAATAAAAA 
4851 TGCCTGCTGC TCACAGCACA GAAAATGGGG CAGGGGGAGC CTCAAGCACA 
4901 ATCTAGCTGT CCTCCTAAAG ACTCTGTAAT GCTCACTCCC CTCGCGTTCT 
4951 CCCGGCGCTG TCGGGAGGCT GTGCTGGTGG TCGTGTAGAG GTCCTTCTCC 
5001 TTTCACATGG TGCAGAGAGC GAGGACCTCT CCTCCTCGTT CAGTTGCACT 
5051 TCAGTATTTT CACGGATATG AATGTAAAAT ATATAAATAT ATAAACCTGC 
5101 GGCTTTAACA ACTGTAATAC AACCTTTTGA ATTAGTTCCG TGTATAGATA 
5151 ATTAAATTCT TCATACAAAA GTTAAAAAAA AAAAAAAAAA AAAAA 
Translated Protein Sequence #11 

1 MAQFPTPFGG SLDIWAITVE ERAKHDQQFH SLKPISGFIT GDQARNFFFQ 
51 SGLPQPVLAQ IWALADMNND GRMDQVEFSI AMKLIKLKLQ GYQLPSALPP 
101 VMKQQPVAIS SAPAFGMGGI ASMPPLTAVA PVPMGSIPVV GMSPTLVSSV 
151 PTAAVPPLAN GAPPVIQPLP AFAHPAATLP KSSSFSRSGP GSQLNTKLQK 
201 AQSFDVASVP PVAEWAVPQS SRLKYRQLFN SHDKTMSGHL TGPQARTILM 
251 QSSLPQAQLA SIWNLSDIDQ DGKLTAEEFI LAMHLIDVAM SGQPLPPVLP 
301 PEYIPPSFRR VRSGSGISVI SSTSVDQRLP EEPVLEDEQQ QLEKKLPVTF 
351 EDKKRENFER GNLELEKRRQ ALLEQQRKEQ ERLAQLERAE QERKERERQE 
401 QERKRQLELE KQLEKQRELE RQREEERRKE IERREAAKRE LERQRQLEWE 
451 RNRRQELLNQ RNKEQEDIVV LKAKKKTLEF ELEALNDKKH QLEGKLQDIR 
501 CRLTTQRQEI ESTNKSRELR IAEITHLQQQ LQESQQMLGR LIPEKQILND 
551 QLKQVQQNSL HRDSLVTLKR ALEAKELARQ HLRDQLDEVE KETRSKLQEI 
601 DIFNNQLKEL REIHNKQQLQ KQKSMEAERL KQKEQERKII ELEKQKEEAQ 
651 RRAQERDKQW LEHVQQEDEH QRPRKLHEEE KLKREESVKK KDGEEKGKQE 
701 AQDKLGRLFH QHQEPAKPAV QAPWSTAEKG PLTISAQENV KVVYYRALYP 
751 FESRSHDEIT IQPGDIVMVD ESQTGEPGWL GGELKGKTGW FPANYAEKIP 
801 ENEVPAPVKP VTDSTSAPAP KLALRETPAP LAVTSSEPST TPNNWADFSS 
851 TWPTSTNEKP ETDNWDAWAA QPSLTVPSAG QLRQRSAFTP ATATGSSPSP 
901 VLGQGEKVEG LQAQALYPWR AKKDNHLNFN KNDVITVLEQ QDMWWFGEVQ 
951 GQKGWFPKSY VKLISGPIRK STSMDSGSSE SPASLKRVAS PAAKPVVSGE 
1001 EFIAMYTYES SEQGDLTFQQ GDVILVTKKD GDWWTGTVGD KAGVFPSNYV 
1051 RLKDSEGSGT AGKTGSLGKK PEIAQVIASY TATGPEQLTL APGQLILIRK 
1101 KNPGGWWEGE LQARGKKRQI GWFPANYVKL LSPGTSKITP TEPPKSTALA 
1151 AVCQVIGMYD YTAQNDDELA FNKGQIINVL NKEDPDWWKG EVNGQVGLFP 
1201 SNYVKLTTDM DPSQQ* 

whole protein sequence: 

1 EWRRQGRERS LVAP*YGGSR GRIPSGLRDG QRGGRGWCAG LRLLRPSQRR 
51 VSGTDLSLGR QRGPARR*GV D*QGKSNRTM AQFPTPFGGS LDIWAITVEE 
101 RAKHDQQFHS LKPISGFITG DQARNFFFQS GLPQPVLAQI WALADMNNDG 
151 RMDQVEFSIA MKLIKLKLQG YQLPSALPPV MKQQPVAISS APAFGMGGIA 
201 SMPPLTAVAP VPMGSIPVVG MSPTLVSSVP TAAVPPLANG APPVIQPLPA 
251 FAHPAATLPK SSSFSRSGPG SQLNTKLQKA QSFDVASVPP VAEWAVPQSS 
301 RLKYRQLFNS HDKTMSGHLT GPQARTILMQ SSLPQAQLAS IWNLSDIDQD 
351 GKLTAEEFIL AMHLIDVAMS GQPLPPVLPP EYIPPSFRRV RSGSGISVIS 
401 STSVDQRLPE EPVLEDEQQQ LEKKLPVTFE DKKRENFERG NLELEKRRQA 
451 LLEQQRKEQE RLAQLERAEQ ERKERERQEQ ERKRQLELEK QLEKQRELER 
501 QREEERRKEI ERREAAKREL ERQRQLEWER NRRQELLNQR NKEQEDIVVL 
551 KAKKKTLEFE LEALNDKKHQ LEGKLQDIRC RLTTQRQEIE STNKSRELRI 
601 AEITHLQQQL QESQQMLGRL IPEKQILNDQ LKQVQQNSLH RDSLVTLKRA 
651 LEAKELARQH LRDQLDEVEK ETRSKLQEID IFNNQLKELR EIHNKQQLQK 
701 QKSMEAERLK QKEQERKIIE LEKQKEEAQR RAQERDKQWL EHVQQEDEHQ 
751 RPRKLHEEEK LKREESVKKK DGEEKGKQEA QDKLGRLFHQ HQEPAKPAVQ 
801 APWSTAEKGP LTISAQENVK VVYYRALYPF ESRSHDEITI QPGDIVMVDE 
851 SQTGEPGWLG GELKGKTGWF PANYAEKIPE NEVPAPVKPV TDSTSAPAPK 
901 LALRETPAPL AVTSSEPSTT PNNWADFSST WPTSTNEKPE TDNWDAWAAQ 
951 PSLTVPSAGQ LRQRSAFTPA TATGSSPSPV LGQGEKVEGL QAQALYPWRA 
1001 KKDNHLNFNK NDVITVLEQQ DMWWFGEVQG QKGWFPKSYV KLISGPIRKS 
1051 TSMDSGSSES PASLKRVASP AAKPVVSGEE FIAMYTYESS EQGDLTFQQG 
1101 DVILVTKKDG DWWTGTVGDK AGVFPSNYVR LKDSEGSGTA GKTGSLGKKP 
1151 EIAQVIASYT ATGPEQLTLA PGQLILIRKK NPGGWWEGEL QARGKKRQIG 
1201 WFPANYVKLL SPGTSKITPT EPPKSTALAA VCQVIGMYDY TAQNDDELAF 
1251 NKGQIINVLN KEDPDWWKGE VNGQVGLFPS NYVKLTTDMD PSQQ*IICCP 
1301 SPPQA*KSFC GFPSYSN*LS PTFAQVLSIV LKLFLNIYFS FLINKINK*L 
1351 LCYFGFAKRP TIKECCMCY* KLFQMSINLR LDVFFHFVQC YQLNCAVWGF 
1401 SPLP*KCRGV QYLCFKDV*N EPN*SEGVCA CLCVSAVPC* ACNTSCT*EI 
1451 SSFHGKAITL YDALIILHLI LFCTVTL*PH EKALCVFVRS QIYLVELVFC 
1501 LGFLILRVCI A*NQ*TTPLR SLRSTISTVS F*SLLHEVLF QLLFME*PIL 
1551 NK*FS*QERM YRSLPAINFQ CLHFLTRLWN FYRLI*NGAH GPFVC*ICCS 
1601 *SPVCLLNTS WKLSIKMPAA HSTENGAGGA SSTI*LSS*R LCNAHSPRVL 
1651 PALSGGCAGG RVEVLLLSHG AESEDLSSSF SCTSVFSRI* M*NI*IYKPA 
1701 ALTTVIQPFE LVPCIDN*IL HTKVKKKKKK K 
1 CGGGGATGGT GTGCGGGGCT GCGGCTCCTG CGTCCCTCCC AGCGGCGCGT 
51 GAGCGGCACT GATTTGTCCC TGGGGCGGCA GCGCGGACCC GCCCGGAGAT 
101 GAGGCGTCGA TTAGCAAGGT AAAAGTAACA GAACCATGGC TCAGTTTCCA 
151 ACACCTTTTG GTGGCAGCCT GGATATCTGG GCCATAACTG TAGAGGAAAG 
201 AGCGAAGCAT GATCAGCAGT TCCATAGTTT AAAGCCAATA TCTGGATTCA 
251 TTACTGGTGA TCAAGCTAGA AACTTTTTTT TTCAATCTGG GTTACCTCAA 
301 CCTGTTTTAG CACAGATATG GGCACTAGCT GACATGAATA ATGATGGAAG 
351 AATGGATCAA GTGGAGTTTT CCATAGCTAT GAAACTTATC AAACTGAAGC 
401 TACAAGGATA TCAGCTACCC TCTGCACTTC CCCCTGTCAT GAAACAGCAA 
451 CCAGTTGCTA TTTCTAGCGC ACCAGCATTT GGTATGGGAG GTATCGCCAG 
501 CATGCCACCG CTTACAGCTG TTGCTCCAGT GCCAATGGGA TCCATTCCAG 
551 TTGTTGGAAT GTCTCCAACC CTAGTATCTT CTGTTCCCAC AGCAGCTGTG 
601 CCCCCCCTGG CTAACGGGGC TCCCCCTGTT ATACAACCTC TGCCTGCATT 
651 TGCTCATCCT GCAGCCACAT TGCCAAAGAG TTCTTCCTTT AGTAGATCTG 
701 GTCCAGGGTC ACAACTAAAC ACTAAATTAC AAAAGGCACA GTCATTTGAT 
751 GTGGCCAGTG TCCCACCAGT GGCAGAGTGG GCTGTTCCTC AGTCATCAAG 
801 ACTGAAATAC AGGCAATTAT TCAATAGTCA TGACAAAACT ATGAGTGGAC 
851 ACTTAACAGG TCCCCAAGCA AGAACTATTC TTATGCAGTC AAGTTTACCA 
901 CAGGCTCAGC TGGCTTCAAT ATGGAATCTT TCTGACATTG ATCAAGATGG 
951 AAAACTTACA GCAGAGGAAT TTATCCTGGC AATGCACCTC ATTGATGTAG 
1001 CTATGTCTGG CCAACCACTG CCACCTGTCC TGCCTCCAGA ATACATTCCA 
1051 CCTTCTTTTA GAAGAGTTCG ATCTGGCAGT GGTATATCTG TCATAAGCTC 
1101 AACATCTGTA GATCAGAGGC TACCAGAGGA ACCAGTTTTA GAAGATGAAC 
1151 AACAACAATT AGAAAAGAAA TTACCTGTAA CGTTTGAAGA TAAGAAGCGG 
1201 GAGAACTTTG AACGTGGCAA CCTGGAACTG GAGAAACGAA GGCAAGCTCT 
1251 CCTGGAACAG CAGCGCAAGG AGCAGGAGCG CCTGGCCCAG CTGGAGCGGG 
1301 CGGAGCAGGA GAGGAAGGAG CGTGAGCGCC AGGAGCAAGA GCGCAAAAGA 
1351 CAACTGGAAC TGGAGAAGCA ACTGGAAAAG CAGCGGGAGC TAGAACGGCA 
1401 GAGAGAGGAG GAGAGGAGGA AAGAAATTGA GAGGCGAGAG GCTGCAAAAC 
1451 GGGAACTTGA AAGGCAACGA CAACTTGAGT GGGAACGGAA TCGAAGGCAA 
1501 GAACTACTAA ATCAAAGAAA CAAAGAACAA GAGGACATAG TTGTACTGAA 
1551 AGCAAAGAAA AAGACTTTGG AATTTGAATT AGAAGCTCTA AATGATAAAA 
1601 AGCATCAACT AGAAGGGAAA CTTCAAGATA TCAGATGTCG ATTGACCACC 
1651 CAAAGGCAAG AAATTGAGAG CACAAACAAA TCTAGAGAGT TGAGAATTGC 
1701 CGAAATCACC CATCTACAGC AACAATTACA GGAATCTCAG CAAATGCTTG 
1751 GAAGACTTAT TCCAGAAAAA CAGATACTCA ATGACCAATT AAAACAAGTT 
1801 CAGCAGAACA GTTTGCACAG AGATTCACTT GTTACACTTA AAAGAGCCTT 
1851 AGAAGCAAAA GAACTAGCTC GGCAGCACCT ACGAGACCAA CTGGATGAAG 
1901 TGGAGAAAGA AACTAGATCA AAACTACAGG AGATTGATAT TTTCAATAAT 
1951 CAGCTGAAGG AACTAAGAGA AATACACAAT AAGCAACAAC TCCAGAAGCA 
2001 AAAGTCCATG GAGGCTGAAC GACTGAAACA GAAAGAACAA GAACGAAAGA 
2051 TCATAGAATT AGAAAAAAAA AAAAAAAAA 
#5 translated protein sequence: 



1 MAQFPTPFGG SLDIWAITVE ERAKHDQQFH SLKPISGFIT GDQARNFFFQ 
51 SGLPQPVLAQ IWALADMNND GRMDQVEFSI AMKLIKLKLQ GYQLPSALPP 
101 VMKQQPVAIS SAPAFGMGGI ASMPPLTAVA PVPMGSIPVV GMSPTLVSSV 
151 PTAAVPPLAN GAPPVIQPLP AFAHPAATLP KSSSFSRSGP GSQLNTKLQK 
201 AQSFDVASVP PVAEWAVPQS SRLKYRQLFN SHDKTMSGHL TGPQARTILM 
251 QSSLPQAQLA SIWNLSDIDQ DGKLTAEEFI LAMHLIDVAM SGQPLPPVLP 
301 PEYIPPSFRR VRSGSGISVI SSTSVDQRLP EEPVLEDEQQ QLEKKLPVTF 
351 EDKKRENFER GNLELEKRRQ ALLEQQRKEQ ERLAQLERAE QERKERERQE 
401 QERKRQLELE KQLEKQRELE RQREEERRKE IERREAAKRE LERQRQLEWE 
451 RNRRQELLNQ RNKEQEDIVV LKAKKKTLEF ELEALNDKKH QLEGKLQDIR 
501 CRLTTQRQEI ESTNKSRELR IAEITHLQQQ LQESQQMLGR LIPEKQILND 
551 QLKQVQQNSL HRDSLVTLKR ALEAKELARQ HLRDQLDEVE KETRSKLQEI 
601 DIFNNQLKEL REIHNKQQLQ KQKSMEAERL KQKEQERKII ELEKKKKK 



whole sequence 



1 RGWCAGLRLL RPSQRRVSGT DLSLGRQRGP ARR*GVD*QG KSNRTMAQFP 
51 TPFGGSLDIW AITVEERAKH DQQFHSLKPI SGFITGDQAR NFFFQSGLPQ 
101 PVLAQIWALA DMNNDGRMDQ VEFSIAMKLI KLKLQGYQLP SALPPVMKQQ 
151 PVAISSAPAF GMGGIASMPP LTAVAPVPMG SIPVVGMSPT LVSSVPTAAV 
201 PPLANGAPPV IQPLPAFAHP AATLPKSSSF SRSGPGSQLN TKLQKAQSFD 
251 VASVPPVAEW AVPQSSRLKY RQLFNSHDKT MSGHLTGPQA RTILMQSSLP 
301 QAQLASIWNL SDIDQDGKLT AEEFILAMHL IDVAMSGQPL PPVLPPEYIP 
351 PSFRRVRSGS GISVISSTSV DQRLPEEPVL EDEQQQLEKK LPVTFEDKKR 
401 ENFERGNLEL EKRRQALLEQ QRKEQERLAQ LERAEQERKE RERQEQERKR 
451 QLELEKQLEK QRELERQREE ERRKEIERRE AAKRELERQR QLEWERNRRQ 
501 ELLNQRNKEQ EDIVVLKAKK KTLEFELEAL NDKKHQLEGK LQDIRCRLTT 
551 QRQEIESTNK SRELRIAEIT HLQQQLQESQ QMLGRLIPEK QILNDQLKQV 
601 QQNSLHRDSL VTLKRALEAK ELARQHLRDQ LDEVEKETRS KLQEIDIFNN 
651 QLKELREIHN KQQLQKQKSM EAERLKQKEQ ERKIIELEKK KKK 
1 GACCACCCAA AGGCAAGAAA TTGAGAGCAC AAACAAATCT AGAGAGTTGA 
51 GAATTGCCGA AATCACCCAT CTACAGCAAC AATTACAGGA ATCTCAGCAA 
101 ATGCTTGGAA GACTTATTCC AGAAAAACAG ATACTCAATG ACCAATTAAA 
151 ACAAGTTCAG CAGAACAGTT TGCACAGAGA TTCACTTGTT ACACTTAAAA 
201 GAGCCTTAGA AGCAAAAGAA CTAGCTCGGC AGCACCTACG AGACCAACTG 
251 GATGAAGTGG AGAAAGAAAC TAGATCAAAA CTACAGGAGA TTGATATTTT 
301 CAATAATCAG CTGAAGGAAC TAAGAGAAAT ACACAATAAG CAACAACTCC 
351 AGAAGCAAAA GTCCATGGAG GCTGAACGAC TGAAACAGAA AGAACAAGAA 
401 CGAAAGATCA TAGAATTAGA AAAACAAAAA GAAGAAGCCC AAAGACGAGC 
451 TCAGGAAAGG GACAAGCAGT GGCTGGAGCA TGTGCAGCAG GAGGACGAGC 
501 ATCAGAGACC AAGAAAACTC CACGAAGAGG AAAAACTGAA AAGGGAGGAG 
551 AGTGTCAAAA AGAAGGATGG CGAGGAAAAA GGCAAACAGG AAGCACAAGA 
601 CAAGCTGGGT CGGCTTTTCC ATCAACACCA AGAACCAGCT AAGCCAGCTG 
651 TCCAGGCACC CTGGTCCACT GCAGAAAAAG GTCCACTTAC CATTTCTGCA 
701 CAGGAAAATG TAAAAGTGGT GTATTACCGG GCACTGTACC CCTTTGAATC 
751 CAGAAGCCAT GATGAAATCA CTATCCAGCC AGGAGACATA GTCATGGTGG 
801 ATGAAAGCCA AACTGGAGAA CCCGGCTGGC TTGGAGGAGA ATTAAAAGGA 
851 AAGACAGGGT GGTTCCCTGC AAACTATGCA GAGAAAATCC CAGAAAATGA 
901 GGTTCCCGCT CCAGTGAAAC CAGTGACTGA TTCAACATCT GCCCCTGCCC 
951 CCAAACTGGC CTTGCGTGAG ACCCCCGCCC CTTTGGCAGT AACCTCTTCA 
1001 GAGCCCTCCA CGACCCCTAA TAACTGGGCC GACTTCAGCT CCACGTGGCC 
1051 CACCAGCACG AATGAGAAAC CAGAAACGGA TAACTGGGAT GCATGGGCAG 
1101 CCCAGCCCTC TCTCACCGTT CCAAGTGCCG GCCAGTTAAG GCAGAGGTCC 
1151 GCCTTTACTC CAGCCACGGC CACTGGCTCC TCCCCGTCTC CTGTGCTAGG 
1201 CCAGGGTGAA AAGGTGGAGG GGCTACAAGC TCAAGCCCTA TATCCTTGGA 
1251 GAGCCAAAAA AGACAACCAC TTAAATTTTA ACAAAAATGA TGTCATCACC 
1301 GTCCTGGAAC AGCAAGACAT GTGGTGGTTT GGAGAAGTTC AAGGTCAGAA 
1351 GGGTTGGTTC CCCAAGTCTT ACGTGAAACT CATTTCAGGG CCCATAAGGA 
1401 AGTCTACAAG CATGGATTCT GGTTCTTCAG AGAGTCCTGC TAGTCTAAAG 
1451 CGAGTAGCCT CTCCAGCAGC CAAGCCGGTC GTTTCGGGAG AAGAAATTGC 
1501 CCAGGTTATT GCCTCATACA CCGCCACCGG CCCCGAGCAG CTCACTCTCG 
1551 CCCCTGGTCA GCTGATTTTG ATCCGAAAAA AGAACCCAGG TGGATGGTGG 
1601 GAAGGAGAGC TGCAAGCACG TGGGAAAAAG CGCCAGATAG GCTGGTTCCC 
1651 AGCTAATTAT GTAAAGCTTC TAAGCCCTGG GACGAGCAAA ATCACTCCAA 
1701 CAGAGCCACC TAAGTCAACA GCATTAGCGG CAGTGTGCCA GGTGATTGGG 
1751 ATGTACGACT ACACCGCGCA GAATGACGAT GAGCTGGCCT TCAACAAGGG 
1801 CCAGATCATC AACGTCCTCA ACAAGGAGGA CCCTGACTGG TGGAAAGGAG 
1851 AAGTCAATGG ACAAGTGGGG CTCTTCCCAT CCAATTATGT GAAGCTGACC 
1901 ACAGACATGG ACCCAAGCCA GCAATGAATC ATATGTTGTC CATCCCCCCC 
1951 TCAGGCTTGA AAGTCCTTTT GTGGCTTTCC TAGTTACTCA AATTGACTTT 
2001 CCCCCACCTT TGCACAGGTG CTTTCAATAG TTTTAAAATT ATTTTTAAAT 
2051 ATATATTTTA GCTTTTTAAT AAACAAAATA AATAAATGAC TTCTTTGCTA 
2101 TTTTGGTTTT GCAAAAAGAC CCACTATCAA GGAATGCTGC ATGTGCTATT 
2151 AAAAATTGTT CCAAATGTCC ATAAATCTGA GACTTGATGT ATTTTTTCAT 
2201 TTTGTCCAGT GTTACCAACT AAATTGTGCA GTTTGGGGCT TTTCCCCCTT 
2251 ACCATAGAAG TGCAGAGGAG TTCAGTATCT CTGTTTTAAA GACGTATAGA 
2301 ATGAGCCCAA TTAAAGCGAA GGTGTTTGTG CTTGTTTGTG TGTATCAGCT 
2351 GTACCTTGTT GAGCATGTAA TACATCCTGT ACATAAGAAA TTAGTTCTTT 
2401 CCATGGCAAA GCTATTACCT TGTACGATGC TCTAATCATA TTGCATTTAA 
2451 TTTTATTTTG CACAGTGACC TTGTAGCCAC ATGAGAAAGC ACTCTGTGTT 
2501 TTTGTTCGGT CTCAGATTTA TCTGGTTGAG TTGGTGTTTT GTTTGGGGTT 
2551 TTTAATTTTG CGTGTTTGCA TAGCATAAAA TCAGTAGACA ACACCACTGA 
2601 GGTCGTTACG ATCAACGATA TCCACAGTCT CTTTTTAGTC TCTGTTACAT 
2651 GAAGTTTTAT TCCAGTTACT TTTCATGGAA TGACCTATTT TGAACAAGTA 
2701 ATTTTCTTGA CAAGAAAGAA TGTATAGAAG TCTCCCTGCA ATTAATTTCC 
2751 AATGTTTACA TTTTTTAACT AGACTGTGGA ATTTCTACAG ATTAATATGA 
2801 AATGGAGCTC ATGGTCCGTT TGTGTGTTAG ATATGCTGTA GCTGAAGCCC 
2851 TGTTTGTCTT TTAAACACTA GTTGGAAGCT CTCAATAAAA ATGCCTGCTG 
2901 CTCACAGCAC AGAAAATGGG GCAGGGGGAG CCTCAAGCAC AATCTAGCTG 
2951 TCCTCCTAAA GACTCTGTAA TGCTCACTCC CCTCGCGTTC TCCCGGCGCT 
3001 GTCGGGAGGC TGTGCTGGTG GTCGTGTAAG GTCCTTCTCC TTTCACATGG 
3051 TGCAGAGAGC GAGGACCTCT CCTCCTCGTT CAGTTGCACT TCAGTATTTT 
3101 CACGGATATG AATGTAAAAT ATATAAATAT ATAAACCTGC GGCTTTAACA 
3151 ACTGTAATAC AACCTTTTGA ATTAGTTCCG TGTATAGATA ATTAAATTCT 
3201 TCATACAAAA GTTAAAAAAA AAAAAAAAAA A 
#9 translated protein sequence: 



1 TTQRQEIEST NKSRELRIAE ITHLQQQLQE SQQMLGRLIP EKQILNDQLK 
51 QVQQNSLHRD SLVTLKRALE AKELARQHLR DQLDEVEKET RSKLQEIDIF 
101 NNQLKELREI HNKQQLQKQK SMEAERLKQK EQERKIIELE KQKEEAQRRA 
151 QERDKQWLEH VQQEDEHQRP RKLHEEEKLK REESVKKKDG EEKGKQEAQD 
201 KLGRLFHQHQ EPAKPAVQAP WSTAEKGPLT ISAQENVKVV YYRALYPFES 
251 RSHDEITIQP GDIVMVDESQ TGEPGWLGGE LKGKTGWFPA NYAEKIPENE 
301 VPAPVKPVTD STSAPAPKLA LRETPAPLAV TSSEPSTTPN NWADFSSTWP 
351 TSTNEKPETD NWDAWAAQPS LTVPSAGQLR QRSAFTPATA TGSSPSPVLG 
401 QGEKVEGLQA QALYPWRAKK DNHLNFNKND VITVLEQQDM WWFGEVQGQK 
451 GWFPKSYVKL ISGPIRKSTS MDSGSSESPA SLKRVASPAA KPWSGEEIA 
501 QVIASYTATG PEQLTLAPGQ LILIRKKNPG GWWEGELQAR GKKRQIGWFP 
551 ANYVKLLSPG TSKITPTEPP KSTALAAVCQ VIGMYDYTAQ NDDELAFNKG 
601 QIINVLNKED PDWWKGEVNG QVGLFPSNYV KLTTDMDPSQ Q* 

Whole protein sequence 



1 TTQRQEIEST NKSRELRIAE ITHLQQQLQE SQQMLGRLIP EKQILNDQLK 
51 QVQQNSLHRD SLVTLKRALE AKELARQHLR DQLDEVEKET RSKLQEIDIF 
101 NNQLKELREI HNKQQLQKQK SMEAERLKQK EQERKIIELE KQKEEAQRRA 
151 QERDKQWLEH VQQEDEHQRP RKLHEEEKLK REESVKKKDG EEKGKQEAQD 
201 KLGRLFHQHQ EPAKPAVQAP WSTAEKGPLT ISAQENVKVV YYRALYPFES 
251 RSHDEITIQP GDIVMVDESQ TGEPGWLGGE LKGKTGWFPA NYAEKIPENE 
301 VPAPVKPVTD STSAPAPKLA LRETPAPLAV TSSEPSTTPN NWADFSSTWP 
351 TSTNEKPETD NWDAWAAQPS LTVPSAGQLR QRSAFTPATA TGSSPSPVLG 
401 QGEKVEGLQA QALYPWRAKK DNHLNFNKND VITVLEQQDM WWFGEVQGQK 
451 GWFPKSYVKL ISGPIRKSTS MDSGSSESPA SLKRVASPAA KPWSGEEIA 
501 QVIASYTATG PEQLTLAPGQ LILIRKKNPG GWWEGELQAR GKKRQIGWFP 
551 ANYVKLLSPG TSKITPTEPP KSTALAAVCQ VIGMYDYTAQ NDDELAFNKG 
601 QIINVLNKED PDWWKGEVNG QVGLFPSNYV KLTTDMDPSQ Q*IICCPSPP 
651 QA*KSFCGFP SYSN*LSPTF AQVLSIVLKL FLNIYFSFLI NKINK*LLCY 
701 FGFAKRPTIK ECCMCY*KLF QMSINLRLDV FFHFVQCYQL NCAVWGFSPL 
751 P*KCRGVQYL CFKDV*NEPN *SEGVCACLC VSAVPC*ACN TSCT*EISSF 
801 HGKAITLYDA LIILHLILFC TVTL*PHEKA LCVFVRSQIY LVELVFCLGF 
851 LILRVCIA*N Q*TTPLRSLR STISTVSF*S LLHEVLFQLL FME*PILNK* 
901 FS*QERMYRS LPAINFQCLH FLTRLWNFYR LI*NGAHGPF VC*ICCS*SP 
951 VCLLNTSWKL SIKMPAAHST ENGAGGASST I*LSS*RLCN AHSPRVLPAL 
1001 SGGCAGGRVR SFSFHMVQRA RTSPPRSVAL QYFHGYECKI YKYINLRL*Q 
1051 L*YNLLN*FR V*IIKFFIQK LKKKKK 
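
The relationship between the nucleotide listing and the two protein listings above can be checked with a short script. The following is a minimal sketch, not part of the original filing: it assumes the standard genetic code and a reading frame that begins at the second nucleotide of the listing (inferred from residue 601 lining up with nucleotides 1802-1804). The names `CODON_TABLE`, `FRAGMENT`, and `translate` are illustrative only.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))

# Nucleotides 1801-1950 of the listing above (whitespace is ignored).
FRAGMENT = """
CCAGATCATC AACGTCCTCA ACAAGGAGGA CCCTGACTGG TGGAAAGGAG
AAGTCAATGG ACAAGTGGGG CTCTTCCCAT CCAATTATGT GAAGCTGACC
ACAGACATGG ACCCAAGCCA GCAATGAATC ATATGTTGTC CATCCCCCCC
"""


def translate(dna, offset=0, to_stop=False):
    """Translate DNA to one-letter protein code.

    offset  -- number of leading nucleotides to skip (assumed frame shift)
    to_stop -- if True, stop before the first stop codon; otherwise read
               through and mark each stop codon with '*'
    """
    seq = "".join(dna.split()).upper()
    residues = []
    for i in range(offset, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")  # 'X' for unrecognized codons
        if aa == "*" and to_stop:
            break
        residues.append(aa)
    return "".join(residues)


if __name__ == "__main__":
    print(translate(FRAGMENT, offset=1))                # read-through translation
    print(translate(FRAGMENT, offset=1, to_stop=True))  # up to the first stop
```

Under these assumptions, the read-through output corresponds to the "Whole protein sequence" listing from residue 601 onward (stop codons shown as `*`), while the `to_stop=True` output corresponds to the end of the "#9 translated protein sequence" listing, without its terminal `*`.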
DECLARATION AND POWER OF ATTORNEY FOR PATENT APPLICATION 

As a below named inventor, I hereby declare that: 

My residence, post office address and citizenship are as stated below under my name. 

I believe that I am the original, first and sole inventor (if only one name is listed 
below) or an original, first and joint inventor (if plural names are listed below) of the subject 
matter which is claimed and for which a patent is sought on the invention entitled 

ISOLATED SH3 GENES ASSOCIATED WITH 
MYELOPROLIFERATIVE DISORDERS AND LEUKEMIA, 
AND USES THEREOF 

the Specification of which 

[X] is attached hereto 

[X] was filed on April 16, 1999 

as Application Serial No. PCT/US99/08371 

I hereby state that I have reviewed and understand the contents of the above-identified 
Specification, including the claims, as amended by any amendment referred to above. 

I acknowledge the duty to disclose information which is material to the examination 
of this application in accordance with Title 37, Code of Federal Regulations, 1.56(a). 

I hereby claim foreign priority benefits under Title 35, United States Code, §119(a)-(d) 
or §365(b) of any foreign application(s) for patent or inventor's certificate, or §365(a) of 
any PCT international application which designated at least one country other than the United 
States of America, listed below and have also identified below, by checking the box, any 
foreign application for patent or inventor's certificate, or of any PCT international application 
having a filing date before that of the application on which priority is claimed. 

PRIOR FOREIGN FILED APPLICATION(S) 
APPLICATION NUMBER    COUNTRY    (MONTH/DAY/YYYY)    PRIORITY CLAIMED 



I hereby claim the benefit under Title 35, United States Code §119(e) of any United 
States provisional application(s) listed below. 

APPLICATION NUMBER(S)    FILING DATE (MM/DD/YYYY) 

60/082,007               April 16, 1998 
I hereby claim the benefit under Title 35, United States Code, §120 of any United 
States application(s), or §365(c) of any PCT international application designating the United 
States of America, listed below and, insofar as the subject matter of each of the claims of this 
application is not disclosed in the prior United States or PCT international application in the 
manner provided by the first paragraph of Title 35, United States Code §112, I acknowledge 
the duty to disclose information which is material to patentability as defined in Title 37, Code 
of Federal Regulations §1.56 which became available between the filing date of the prior 
application and the national or PCT international filing date of this application. 

U.S. Parent          PCT Parent        Parent Filing      Parent Patent 
Application No.      Number            (MM/DD/YYYY)       Number (if applicable) 

                     PCT/US99/08371    April 16, 1999 



I hereby appoint as my attorneys or agents the registered persons identified under 

Customer No. 23565 

for the law firm of Klauber & Jackson, said attorneys or agents with full power of substitution 
and revocation to prosecute this application and transact all business in the Patent and 
Trademark Office connected therewith. 

Please address all correspondence regarding this application to Customer No. 23565. 

DAVID A. JACKSON, ESQ. 
KLAUBER & JACKSON 
411 HACKENSACK AVENUE 
HACKENSACK, NEW JERSEY 07601 

Direct all telephone calls to David A. Jackson at (201) 487-5800. 

I hereby declare that all statements made herein of my own knowledge are true and 
that all statements made on information and belief are believed to be true; and further, that 
these statements were made with the knowledge that willful false statements and the like so 
made are punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the 
United States Code and that such willful false statements may jeopardize the validity of the 
application or any patent issued thereon. 



FULL NAME OF FIRST OR SOLE INVENTOR: Julie R. Korenberg 
COUNTRY OF CITIZENSHIP: The United States 

FULL RESIDENCE ADDRESS: 8125 Skyline Drive 

Los Angeles, California 90048-1865 

FULL POST OFFICE ADDRESS : SAME AS ABOVE 



SIGNATURE OF INVENTOR 
DATE 




FULL NAME OF SECOND JOINT INVENTOR : Xiao-Ning Chen 
COUNTRY OF CITIZENSHIP: The United States 



FULL RESIDENCE ADDRESS : 723 Nicholas Lane 

Arcadia, California 91006 

FULL POST OFFICE ADDRESS: SAME AS ABOVE 




SIGNATURE OF INVENTOR 



DATE 



